Computer Organization and Microprogramming


YAOHAN CHU


Professor of Computer Science 
University of Maryland 


PRENTICE-HALL, INC., Englewood Cliffs, New Jersey 


© 1972 
by Prentice-Hall, Inc., Englewood Cliffs, N. J.


All rights reserved. No part of this book 
may be reproduced in any form or by any means 
without permission in writing from the publisher. 


10 9 8 7 6 5 4 3 2 1


ISBN: 0-13-166025-X
Library of Congress Catalog Card No. 79-39306


Printed in the United States of America 


PRENTICE-HALL INTERNATIONAL, INC., London 
PRENTICE-HALL OF AUSTRALIA, PTY. LTD., Sydney 
PRENTICE-HALL OF CANADA, LTD., Toronto 
PRENTICE-HALL OF INDIA PRIVATE LIMITED, New Delhi 
PRENTICE-HALL OF JAPAN, INC., Tokyo


To my wife, Elizabeth


Contents 


Chapter 1 Computer Design Language

1.1 Declaration Statements
    1.1.1 Register Statements
    1.1.2 Memory Statements
    1.1.3 Switch Statements
    1.1.4 Light Statements
    1.1.5 Terminal Statements
    1.1.6 Clock Statements

1.2 Micro-statements
    1.2.1 Basic Operators
    1.2.2 Expressions
    1.2.3 Micro-operations
    1.2.4 Conditional Micro-statements
    1.2.5 Special Operators

1.3 Sequencing
    1.3.1 Execution Statements
    1.3.2 Sequencing by a Multiple-phase Clock
    1.3.3 Sequencing by a Control Register
    1.3.4 Sequencing by Multiple Control Sequences

1.4 Description of a Stored-program Computer
    1.4.1 Configuration
    1.4.2 Formats
    1.4.3 Instruction Set
    1.4.4 Sequence Chart
    1.4.5 Statement Description
    1.4.6 Data and Control Paths
    1.4.7 An Example of Using Special Operators

References

Problems


Chapter 2 Some Organizations

2.1 A Serial Parity Generator
    2.1.1 Generation of a Parity Bit
    2.1.2 Configuration
    2.1.3 Sequence Chart
    2.1.4 Sequence Description

2.2 Serial Comparators
    2.2.1 Serial Comparison of Two Binary Numbers
    2.2.2 Configuration for Comparing Unsigned Binary Numbers
    2.2.3 Sequence Chart for Comparing Unsigned Binary Numbers
    2.2.4 Sequence Description for Comparing Unsigned Binary Numbers
    2.2.5 A Serial Comparator for Signed Binary Numbers

2.3 Finding the Largest Number
    2.3.1 An Algorithm of Finding the Largest Number
    2.3.2 Macro Design: Version A
    2.3.3 Macro Design: Version B

2.4 A Prime Number Generator
    2.4.1 Generation of Prime Numbers
    2.4.2 "Software" Description
    2.4.3 "Hardware" Description

2.5 A Gray-to-binary Code Converter
    2.5.1 Gray Code to Binary Code Conversion
    2.5.2 Configuration
    2.5.3 Sequence Chart
    2.5.4 Sequence Description

2.6 A Binary-to-decimal Code Converter
    2.6.1 Binary-to-decimal Number Conversion
    2.6.2 Configuration
    2.6.3 Conversion Process
    2.6.4 Sequence Descriptions

2.7 Stored-carry Addition
    2.7.1 Mercer's Addition Algorithm
    2.7.2 Statement Description
    2.7.3 Description of a Zero Test
    2.7.4 Description of a Comparison Test

2.8 A Bowling-score Computer
    2.8.1 Rules for Playing a Bowling Game
    2.8.2 Computation of Bowling Score
    2.8.3 Configuration
    2.8.4 Sequence Charts
    2.8.5 Sequence Description

References

Problems


Chapter 3 Microprogramming

3.1 A Parity Generator
    3.1.1 Microprogram Control Configuration
    3.1.2 Control Word Format
    3.1.3 Sequence Description
    3.1.4 Microprogram
    3.1.5 Control Cycle
    3.1.6 Comparison

3.2 A Microprogrammed Computer
    3.2.1 Configuration
    3.2.2 Control Signals
    3.2.3 Control Word Format
    3.2.4 Sequence Description
    3.2.5 Microprogram

3.3 A Stored Logic Computer
    3.3.1 Configuration
    3.3.2 Sequential Operations
    3.3.3 Control Signals
    3.3.4 Control Word Format
    3.3.5 Sequence Description
    3.3.6 Microprogram

3.4 A Microprogrammed Sequence
    3.4.1 Microprogram Control Configuration
    3.4.2 Timing and Control Signals
    3.4.3 Control Word Format
    3.4.4 Microprogramming the Sequence
    3.4.5 Microprogram

References

Problems


Chapter 4 A Fixed-point Arithmetic Unit

4.1 Configuration of the Arithmetic Unit
    4.1.1 Fixed-point Number Representation
    4.1.2 Configuration
    4.1.3 Parallel Adder
    4.1.4 Terminals Z
    4.1.5 Operator Add2

4.2 Arithmetic Algorithms
    4.2.1 Fixed-point Addition and Subtraction
    4.2.2 Fixed-point Multiplication
    4.2.3 Fixed-point Division

4.3 CDL Descriptions
    4.3.1 Timing and Control Signals
    4.3.2 Addition and Subtraction Sequence
    4.3.3 Multiplication Sequence
    4.3.4 Division Sequence

4.4 A Parallel Adder with Group and Section Carries
    4.4.1 Single-bit Full Adder
    4.4.2 Organization of the Parallel Adder
    4.4.3 Terminal Statements for the Parallel Adder
    4.4.4 Group Carries
    4.4.5 Section Carries

4.5 Design Considerations
    4.5.1 Double-rank Registers
    4.5.2 Fixed-point Arithmetic Instructions

4.6 Microprogramming the Arithmetic Unit
    4.6.1 Microprogram Control Configuration
    4.6.2 Timing and Control Signals
    4.6.3 Control Word Format
    4.6.4 Microprogramming the Arithmetic Sequences
    4.6.5 Microprogram

References

Problems


Chapter 5 A Floating-point Arithmetic Unit

5.1 Configuration of the Arithmetic Unit
    5.1.1 Floating-point Number Representation
    5.1.2 Characteristic Part
    5.1.3 Fraction Part
    5.1.4 Configuration

5.2 Floating-point Addition and Subtraction
    5.2.1 Initialization
    5.2.2 Characteristic Alignment
    5.2.3 Fraction Addition and Subtraction
    5.2.4 Normalization

5.3 Floating-point Multiplication
    5.3.1 Initiation
    5.3.2 Characteristic Addition
    5.3.3 Fraction Multiplication
    5.3.4 Normalization

5.4 Floating-point Division
    5.4.1 Initiation
    5.4.2 Dividend Alignment
    5.4.3 Characteristic Subtraction
    5.4.4 Fraction Division

5.5 CDL Descriptions
    5.5.1 Timing and Control Signals
    5.5.2 Floating-point Addition and Subtraction
    5.5.3 Floating-point Multiplication
    5.5.4 Floating-point Division

References

Problems


Chapter 6 Serial Arithmetic Units

6.1 Configuration of a Binary, Serial Arithmetic Unit
    6.1.1 Number Representation
    6.1.2 Full Adder-subtracter
    6.1.3 Configuration

6.2 Binary Addition and Subtraction
    6.2.1 Algorithm
    6.2.2 Overflow Condition
    6.2.3 Configuration
    6.2.4 Sequence Charts

6.3 Binary Multiplication
    6.3.1 Algorithm
    6.3.2 Configuration
    6.3.3 Sequence Chart

6.4 Binary Division
    6.4.1 Algorithm
    6.4.2 Divide-stop Condition
    6.4.3 Configuration
    6.4.4 Sequence Charts

6.5 Statement Descriptions
    6.5.1 Control Configuration
    6.5.2 Addition and Subtraction Sequences
    6.5.3 Multiplication Sequence
    6.5.4 Division Sequence

6.6 Organization of a Decimal Arithmetic Unit
    6.6.1 Binary Coded Decimal Numbers
    6.6.2 Modes of Operation
    6.6.3 Decimal-digit Adders and Subtracters
    6.6.4 A Decimal Serial-digit Arithmetic Unit
    6.6.5 Decimal Addition and Subtraction
    6.6.6 Decimal Multiplication
    6.6.7 Decimal Division Algorithm

6.7 Decimal Multipliers and Dividers
    6.7.1 A Multiplier Using the Nine Multiples of Multiplicand
    6.7.2 A Multiplier Using the Doubling-and-Halving Method
    6.7.3 A Multiplier Using a Built-in Multiplication Table
    6.7.4 A Divider Using the Nine Multiples of the Divisor

References

Problems


Chapter 7 Memory Organization

7.1 Random Access Memory
    7.1.1 Array Organization
    7.1.2 Module Organization
    7.1.3 Crossbar Switch
    7.1.4 Multiple-access Organization
    7.1.5 Types of Random Access Memories

7.2 Memory Addressing
    7.2.1 Immediate, Direct and Indirect Addressing
    7.2.2 Indexed Addressing
    7.2.3 Relative Addressing
    7.2.4 Base Addressing
    7.2.5 Register Addressing

7.3 Memory Stack
    7.3.1 Stack Organization
    7.3.2 Stack Operation
    7.3.3 Stack Adjustment

7.4 Associative Memory
    7.4.1 Memory Organization
    7.4.2 Match on a Numerical Argument
    7.4.3 Match on a Boolean Argument
    7.4.4 Match on a Count Argument

7.5 A Dynamic Loader
    7.5.1 Loader Organization
    7.5.2 Allocation and Loading
    7.5.3 Instruction Sequencing
    7.5.4 Operand Address Fetch
    7.5.5 Branching, Indexing and Indirect Addressing
    7.5.6 Program Return and Storage Release

7.6 Memory Buffer
    7.6.1 Memory Buffering
    7.6.2 Buffering Organization
    7.6.3 Buffer-access Sequence
    7.6.4 Performance Evaluation

7.7 Virtual Memory
    7.7.1 Basic Concept
    7.7.2 Paging
    7.7.3 Segmentation
    7.7.4 Segmented Paging
    7.7.5 Scheduling

References

Problems


Chapter 8 Control Organization

8.1 Sequential-logic Control Organization
    8.1.1 Single-step Control
    8.1.2 Single-sequence Control
    8.1.3 Multiple-sequence Control
    8.1.4 Timing

8.2 Microprogram Control Organization
    8.2.1 A Microprogram Control Unit
    8.2.2 Timing
    8.2.3 Control Hierarchy
    8.2.4 Control Word Format
    8.2.5 A Microprogrammed CPU
    8.2.6 A Microprogrammed I/O Control Unit

8.3 Central Control Organization
    8.3.1 Sequencing
    8.3.2 Addressing
    8.3.3 Priority Interrupt
    8.3.4 Interrupt Sequence

8.4 Asynchronous Control Organization
    8.4.1 Memory Unit
    8.4.2 Input-Output Unit
    8.4.3 Central Processing Unit
    8.4.4 Interfaces

8.5 Computer System Configuration
    8.5.1 Channel
    8.5.2 Control Unit
    8.5.3 Modular Configuration

References

Problems


Chapter 9 Computer Organization

9.1 System Units
    9.1.1 Main Memory
    9.1.2 Central Processing Unit
    9.1.3 Channels
    9.1.4 I/O Devices and Control Units
    9.1.5 Operating System

9.2 Formats and Codes
    9.2.1 Data Formats
    9.2.2 Data Representations
    9.2.3 Data Codes
    9.2.4 Instruction Formats

9.3 A Computer Organization
    9.3.1 Data Flow
    9.3.2 Storages
    9.3.3 Registers
    9.3.4 Stats
    9.3.5 Buses
    9.3.6 Channels
    9.3.7 I/O Interface
    9.3.8 I/O Control Units and Devices
    9.3.9 Channel-to-Channel Adapter

9.4 Processing Unit
    9.4.1 Instruction Set
    9.4.2 General-Purpose Registers
    9.4.3 Processing Modes
    9.4.4 Processing Operations
    9.4.5 Code Translation

9.5 Main Control Unit
    9.5.1 Instruction Sequencing
    9.5.2 CPU Status
    9.5.3 Program Status Word
    9.5.4 Interrupt
    9.5.5 Microprogram
    9.5.6 Interrupt Supervisor

9.6 Supervisor and other Controls
    9.6.1 Storage Protection
    9.6.2 Interval Timer
    9.6.3 Wait State
    9.6.4 Direct Control
    9.6.5 Initial Program Loading

References

Problems


Chapter 10 Channel Organization

10.1 I/O Control Organization
    10.1.1 Direct Program Control
    10.1.2 I/O Data Buffering
    10.1.3 Data Channel
    10.1.4 Multiplexor Channel
    10.1.5 I/O Processor

10.2 Channel Operation
    10.2.1 Channel Address Word
    10.2.2 Channel Command Word
    10.2.3 Channel Commands
    10.2.4 Channel Program
    10.2.5 Channel Status Word
    10.2.6 Unit Control Words
    10.2.7 Program Status Word
    10.2.8 I/O Instructions
    10.2.9 I/O Interrupts

10.3 I/O Interface
    10.3.1 Interface in a Multisystem
    10.3.2 Interface Lines
    10.3.3 Sequence Controls
    10.3.4 Address, Command, Status, and Sense Bytes
    10.3.5 Interface Sequences

10.4 Selector Channel
    10.4.1 Unit Control Word
    10.4.2 Data Flow
    10.4.3 Data Service Operation
    10.4.4 Start I/O
    10.4.5 Other I/O Instructions

10.5 Multiplexor Channel
    10.5.1 Unit Control Word
    10.5.2 Test Channel
    10.5.3 Halt I/O
    10.5.4 Test I/O
    10.5.5 Start I/O

References


Chapter 11 Microprogramming Software

11.1 Translation of Relocatable Code into Execution Code
    11.1.1 Relocatable Elements
    11.1.2 Executable Code
    11.1.3 Translation Algorithm

11.2 Configuration

11.3 Sequences
    11.3.1 Initialization Sequence
    11.3.2 Fetch Sequence
    11.3.3 Unpacking Sequence
    11.3.4 Address Modification Sequence

11.4 Microprogram Control
    11.4.1 Control Configuration
    11.4.2 Timing and Control Signals
    11.4.3 Control Word Format

11.5 Microprogramming
    11.5.1 Unpacking Sequence
    11.5.2 Initialization Sequence
    11.5.3 Fetch Sequence
    11.5.4 Address Modification Sequence
    11.5.5 Microprogram

References

Problems


Index


Preface 


Computer organization describes how a digital computer functions. It concerns
neither the electronic circuits, modules, cabinets, cables, and the like of which a
digital computer is made, nor the interconnections of gates, flipflops, switches,
lights, and memories by which the computer is implemented. It does describe data
formats, number representations, instruction repertoire, configurations,
micro-operations, sequences, timing, commands, and controls of the digital computer.
In short, computer organization describes the functional organization and sequential
operation of a digital computer.

Computer organization is commonly described in block diagrams with narratives.
Such a description gives one a general idea, but it lacks precision and the needed
depth. This book uses the Computer Design Language (CDL), a highly descriptive
language, to describe the computer elements and their sequential operations.
A description in this language is concise, descriptive, and precise. The book also
makes use of sequence charts to describe algorithms. The sequence chart allows the
description of parallel operations, simultaneous sequences, and independent loops.
Two versions, the procedural and the nonprocedural, are both used here. This book
amply shows the need for a language such as the CDL to describe complex algorithms
implemented in hardware.

This book is written for seniors and first-year graduate students, while the
author’s earlier book, Introduction to Computer Organization, was written for
sophomores and juniors. Since no background in electronics is needed, it suits
students in mathematics or business as well as students in electrical engineering or
computer science. Because it uses the CDL, this book presents a great amount of detail
and many case studies on the internal organizations and algorithms of digital
computers; indeed, this constitutes a major difference from other books on computer
organization. A CDL simulator which accepts a subset of the CDL is available. There
are four versions, for the IBM 7094, the Univac 1108, the CDC 6600, and the IBM S/360
computer systems. The Univac 1108 version allows both batch processing and demand
processing on a terminal. The first two versions were developed at the Computer
Science Center of the University of Maryland. The CDC 6600 version was provided by
the Naval Ordnance Laboratory at White Oak, Md. The IBM S/360 version was
contributed by the Computer Systems Group at the University of Toronto.

There are eleven chapters. Chapter 1 introduces the Computer Design Language.
Chapter 2 illustrates the use of the CDL by describing a number of simple
organizations such as a comparator, code converters, and a bowling-score computer.
Chapter 3 introduces the idea of a microprogrammed computer and microprogramming; a
simple microprogrammed computer and a stored logic computer are described.
Microprogramming is further presented in Chapters 4, 8, and 11. Chapter 4 describes
a parallel, binary, fixed-point arithmetic unit by both sequential logic control and
microprogram control, while Chapter 5 describes a parallel, binary, floating-point
arithmetic unit. A parallel adder with group and section carries similar to that
in the IBM 7090 family of computers is described in Chapter 4. Chapter 6 describes
serial arithmetic units, both binary and binary-coded-decimal. Chapter 7 describes
functional organizations of random access memory, associative memory, virtual
memory, memory addressing, memory stack, and memory loader. A memory buffer
organization similar to that in the IBM System/360 model 85 computer is described.
Chapter 8 presents control organizations: the sequential-logic control organization,
the microprogram control organization, the asynchronous organization, and the
central control organization. With the major units of a computer organization
presented and illustrated, a complete computer organization is then described in
Chapter 9. Because of the popularity of the IBM System/360 family of computers,
model 40 of the family is chosen and described. The channel organization of the
model 40 is described in Chapter 10. The idea of implementing software by
microprogramming is introduced in Chapter 11 by an example. This example shows the
microprogramming of the translation of a relocatable code into an executable code.

The author wishes to express his gratitude for permission to use material from
publications by the ACM, the IEEE, the AFIPS, and Computer Design, and
particularly those by the IBM Corporation. The author also wishes to express his
appreciation to the many students who assisted during the preparation of the book,
particularly O. R. Pardo for his work in Chapter 11, and to Nancy Nowell, who typed
the complete manuscript.


YAOHAN CHU 
Chevy Chase, Maryland 


Chapter 1 Computer Design Language


Computer organization is commonly described by block diagrams accompanied
by narratives. Such a description can convey general ideas, but it lacks the
precision and depth needed to understand the organization thoroughly. Therefore, it
is greatly desirable that a higher-order language be developed to describe computer
organization.

A number of such higher-order languages have been reported in recent years
(2-18). Among them is the Computer Design Language, or simply CDL (7), which
has been developed to describe the functional organization, algorithms, and
sequential operations of a digital computer. This language is highly descriptive when
identifying computer elements such as registers, decoders, switches, lights,
memories, and terminals. It is precise in describing elements, algorithms, and
operations; and it is not only highly expressive at the bit level, word level, and
bit-array level, but also expressive with timing signals and control commands.
Computer Design Language can also describe serial and parallel transfers and parallel
operations. Moreover, no background in electronics is required for a person to
understand and use this language.

The first chapter introduces the Computer Design Language which will be
used throughout this book. It includes a description of the elements of a simple
stored-program digital computer, its micro-operations and sequencing, and
finally the complete description.


1.1 Declaration Statements 


This digital computer is a stored-program computer containing six registers, one 
random-access memory, three switches, one light, one decoder, and a clock. These 
computer elements are described by declaration statements which not only identify 
the computer elements, but also give a symbolic name to each of the elements and, 
if necessary, specify their functions. These elements are described below. 


1.1.1 Register Statements


The six registers of this stored-program computer are R(0-23), A(0-23), C(0-14),
D(0-14), F(0-5), and G. As indicated by the subscripts, both registers R and A have
24 bits, each bit of the register identified by the subscript. The leftmost bit is R(0)
or A(0), and the rightmost bit R(23) or A(23). Register R is the buffer register of the
memory, and register A is the accumulator where addition, subtraction, or shifting
of an operand is carried out. Registers C and D both have 15 bits. The former is the
address register of the memory, and the latter is the program register (or the
next-instruction-address register). Register G, which is shown with no subscript, is a
single-bit register. It controls the operate or stop condition of the computer.
Registers are specified by register statements, such as the following statement which
describes the above registers:


Register, R(0-23), A(0-23), C(0-14), D(0-14), F(0-5), G (1.1)


For better readability, comments may be inserted into a register statement. For
example, statement (1.1) may be rewritten as


Register, R(0-23), $buffer register
          A(0-23), $accumulator
          C(0-14), $address register
          D(0-14), $program register (1.2)
          F(0-5), $control register
          G, $start-stop control register


The phrases beginning with dollar signs are comments. 



A subregister is merely a part of a register which contains bits with a special 
meaning. Such a part can be given a symbolically subscripted name by a subregister 
statement. For instance, the following subregister statement, 


Register, R(0-23)
Subregister, R(OP)=R(0-5), R(ADDR)=R(9-23)


declares that the leftmost six bits of register R be given a symbolically subscripted 
name R(OP), and the rightmost 15 bits a symbolically subscripted name R(ADDR). 
Subregisters R(OP) and R(ADDR) contain the op-code part and the address-part 
of an instruction, respectively. 
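
The subregister idea can be illustrated outside CDL. The following Python sketch is only an illustration, not part of the CDL; it assumes a 24-bit register held as an integer, with bit 0 as the leftmost bit as in the text. The function names subregister, r_op, and r_addr are hypothetical.

```python
WORD_BITS = 24  # register R(0-23); bit 0 is the leftmost bit, as in the text


def subregister(value, first, last, width=WORD_BITS):
    """Return bits first..last of a width-bit register, numbering bits from the left."""
    if not (0 <= first <= last < width):
        raise ValueError("subscripts out of range")
    length = last - first + 1
    shift = width - 1 - last          # number of bits lying to the right of the field
    return (value >> shift) & ((1 << length) - 1)


def r_op(r):
    """R(OP): the leftmost six bits of R."""
    return subregister(r, 0, 5)


def r_addr(r):
    """R(ADDR): the rightmost 15 bits of R."""
    return subregister(r, 9, 23)


# Example word with op-code 42 and address 0x1234 (illustrative values only).
word = (0b101010 << 18) | 0x1234
print(r_op(word), hex(r_addr(word)))
```

Numbering from the left means the shift amount is measured from the right end of the register, which is why the sketch computes width - 1 - last.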

Sometimes two or more registers or subregisters are cascaded into one register 
to perform a special function. This is called a cascaded register or simply a casregister, 
which can in turn be described by a casregister statement. 

For example, the following casregister statement, 


Register, A(1-5), PB,
Casregister, D(1-6)=A-PB


declares a casregister D which is register A cascaded by register PB. This casregister 
statement is not required if no new name is needed. As another example, the follow- 
ing casregister statement, 


Register, A(0-23), Q(0-23)
Subregister, Q(M)=Q(1-23) (1.3)
Casregister, AQ(0-46)=A-Q(M)


declares a casregister AQ which is register A cascaded by subregister Q(M). It should 
be noted that a casregister is not a newly created register, but is a register cascaded 
from the registers that have been declared. 
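
As an illustration only, cascading can be modeled in Python by concatenating bit fields. The sketch assumes each register is given as a (value, width) pair with the leftmost register first; the helper name cascade is hypothetical and is not a CDL construct.

```python
def cascade(*parts):
    """Cascade (value, width) pairs, leftmost first, into one (value, width) pair."""
    value, width = 0, 0
    for part_value, part_width in parts:
        if not 0 <= part_value < (1 << part_width):
            raise ValueError("value does not fit in the stated width")
        value = (value << part_width) | part_value
        width += part_width
    return value, width


# D(1-6) = A(1-5) cascaded with the single-bit register PB
d, d_width = cascade((0b10101, 5), (1, 1))

# AQ(0-46) = A(0-23) cascaded with Q(M) = Q(1-23), the rightmost 23 bits of Q
a, q = 1, 0xFFFFFF
q_m = q & ((1 << 23) - 1)          # drop Q(0), the leftmost bit
aq, aq_width = cascade((a, 24), (q_m, 23))
```

No bits are copied into new storage in the CDL sense; the sketch merely shows how the cascaded name views the existing bits as one longer register.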

It is often necessary to describe a number of identical registers which form an array
of flipflops (i.e., single-bit registers). An example is the use of four identical registers
A1(0-8), A2(0-8), A3(0-8) and A4(0-8) to store an eight-digit decimal number.
Instead of declaring them as four registers, it is more convenient to declare them as
an array register by an array-register statement as in (1.4).


Array-register, A(1-4, 0-8), Q(1-4, 1-8) (1.4)


Statement (1.4) describes an array of flipflops called A in four rows by nine columns 
and another called Q in four rows by eight columns. Each 4-bit column stores a binary 
coded decimal digit (to be described in Chapter 6). Each row or each column of the 
array can be specified by omitting one subscript but retaining the comma. For exam- 
ple, the second row of array A can be denoted by subregister A(2,) and the eighth 
column of array Q by subregister Q(,8). 
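
A minimal Python sketch of an array register follows. It is an illustration under stated assumptions: the class name ArrayRegister and its methods are hypothetical, and the choice of row 1 as the most significant bit of each decimal digit is an assumption, since the bit order within a column is not given here.

```python
class ArrayRegister:
    """An array of flipflops addressed by (row, column) subscript ranges."""

    def __init__(self, rows, cols):
        self.row_lo, self.row_hi = rows
        self.col_lo, self.col_hi = cols
        n_rows = self.row_hi - self.row_lo + 1
        n_cols = self.col_hi - self.col_lo + 1
        self.bits = [[0] * n_cols for _ in range(n_rows)]

    def row(self, r):
        """Bits of one row, e.g. A(2,)."""
        return list(self.bits[r - self.row_lo])

    def column(self, c):
        """Bits of one column, e.g. Q(,8)."""
        return [row_bits[c - self.col_lo] for row_bits in self.bits]

    def store_digit(self, c, digit):
        """Store a decimal digit in column c (assumed: first row is most significant)."""
        if not 0 <= digit <= 9:
            raise ValueError("not a decimal digit")
        n_rows = len(self.bits)
        for i in range(n_rows):
            self.bits[i][c - self.col_lo] = (digit >> (n_rows - 1 - i)) & 1


a = ArrayRegister((1, 4), (0, 8))   # A(1-4, 0-8): four rows by nine columns
q = ArrayRegister((1, 4), (1, 8))   # Q(1-4, 1-8): four rows by eight columns
q.store_digit(8, 9)                 # digit 9 in the eighth column of Q
```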



Instead of a casregister, there can be an array-casregister. For example, the 
cascade of the above array-registers A and Q can be defined by the following array- 
casregister statement, 


Array-casregister, AQ(1-4, 0-16)=A-Q, (1.5)


Furthermore, the basic operators to be presented subsequently can also be applied to 
the array-register and array-casregister. 


1.1.2 Memory Statements 


A stored-program computer requires a random-access memory which is com- 
posed of a large number of registers called memory locations. Each location is asso- 
ciated with a memory address, and each memory is associated with an address register 
and a buffer register. If a location of the memory is to be accessed, the memory 
address of the location must be placed in the address register before a memory trans- 
fer occurs. A simple random-access memory may be specified by a memory statement. 
The memory statement which describes the random-access memory of the simple 
computer is: 


Register, C(0-14) $address register
Memory, M(C)=M(0-32767,0-23) (1.6)

The memory statement in (1.6) declares a memory M and specifies address register
C by giving C as the subscript of M(C). It also states that the memory has a capacity
of 32,768 words and a word length of 24 bits. The description of a register requires
one subscript, but the description of a memory requires two.
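
The role of the address register can be sketched in Python as follows. This is an illustration only: the class Memory and its method names are hypothetical, and truncating a stored value to the word length is an assumption of the sketch rather than a stated CDL rule.

```python
class Memory:
    """A random-access memory M(C) of 32,768 words of 24 bits each."""

    def __init__(self, words=32768, word_bits=24):
        self.words = words
        self.mask = (1 << word_bits) - 1
        self.cells = [0] * words
        self.c = 0                      # address register C

    def set_address(self, address):
        """Place a memory address in the address register before a transfer."""
        if not 0 <= address < self.words:
            raise ValueError("address outside the memory")
        self.c = address

    def read(self):
        """Return the word at the location selected by C."""
        return self.cells[self.c]

    def write(self, value):
        """Store a word at the location selected by C (assumed: extra bits dropped)."""
        self.cells[self.c] = value & self.mask


m = Memory()
m.set_address(32767)      # last location
m.write(0xABCDEF)
```

Every access goes through the address register, which mirrors the requirement that the address be placed in C before a memory transfer occurs.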


1.1.3 Switch Statements 


Manual switches are commonly used on the control panel of a digital computer 
as an external control of the computer by the operator. Thus, a switch is an input 
device having one or more positions. When it has one position, it generates a pulse 
when it is turned on. When it has more than one position, it remains at the position 
to which it is turned. Thus, a switch is also a storage element. 

Switches may be specified by a switch statement. There are three switches in the
simple digital computer. These switches may be described by the following switch
statement:


Switch, POWER(ON), 
START(ON), (1.7) 
STOP(ON), 


Statement (1.7) declares three single-position switches with the same position name
ON. Switch POWER initializes computer operation, and switches START and STOP
start and stop computer operations, respectively. A switch statement may declare
switches with one set of subscripts to describe a row of switches or switches with two
sets of subscripts to describe an array of switches.


1.1.4 Light Statements 


Lights are commonly installed on the control panel of a computer to indicate
the status of the computer to the operator. Thus, a light is an output device having
one or more conditions. If it has only one condition, it generates a flash when it is
turned on. If it has two or more conditions, it remains at the light condition to which
it is turned. Thus, a light is also a storage element.

Lights may be specified by a light statement. The following is the statement 
which describes the light indicating overflow in the computer: 


Light, LTOV (ON,OFF) (1.8) 


This light LTOV has two conditions, ON and OFF. It is turned to the ON condition
when an addition overflow occurs. A light statement declaring lights with one or
two subscripts in the light names may describe a row or an array of lights.


1.1.5 Terminal Statements 


A logic network is a group of interconnected logical circuits with a particular 
function but with no storage element. It is also known as a combinatorial circuit. 
For example, a parallel adder which adds two unsigned binary numbers in parallel 
is a logic network. 

A logic network is described by a terminal statement. For instance, the parallel 
adder of the simple digital computer is now described by a terminal statement shown 
below: 


Comment, description of a parallel adder, (1.9)
Register, A(0-23), R(0-23)
Terminal, C(23)=0,
          C(0-22)=A(1-23)*R(1-23)+R(1-23)*C(1-23)+C(1-23)*A(1-23),
          SUM(0-23)=A(0-23)⊕R(0-23)⊕C(0-23),

In the above statements, registers A and R are the inputs to the parallel adder, and
terminals SUM are the outputs. Terminals C(0-22) are the carry outputs from each
stage of the parallel adder. Terminal C(23) is assigned the value of 0. Notice that the
parallel adder described in statement (1.9) adds the unsigned binary numbers in
registers A and R, but it gives no indication when an addition overflow occurs.
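
The carry and sum equations of the parallel adder can be checked with a short Python sketch. It is only an illustration: bits are held in lists with index 0 as the leftmost bit, and the helper names to_bits, from_bits, and parallel_adder are hypothetical.

```python
def to_bits(value, n):
    """Split an unsigned integer into n bits, leftmost (bit 0) first."""
    return [(value >> (n - 1 - i)) & 1 for i in range(n)]


def from_bits(bits):
    """Combine a list of bits, leftmost first, into an unsigned integer."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


def parallel_adder(a, r):
    """Return (SUM, C) for equal-length bit lists a and r."""
    n = len(a)
    c = [0] * n                          # c[n-1] plays the role of C(23)=0
    for i in range(n - 2, -1, -1):       # carry into stage i comes from stage i+1
        c[i] = (a[i + 1] & r[i + 1]) | (r[i + 1] & c[i + 1]) | (c[i + 1] & a[i + 1])
    total = [a[i] ^ r[i] ^ c[i] for i in range(n)]
    return total, c


s, _ = parallel_adder(to_bits(5, 24), to_bits(7, 24))
wrapped, _ = parallel_adder(to_bits(0xFFFFFF, 24), to_bits(1, 24))
```

In the second call the true sum does not fit in 24 bits, and the SUM terminals simply hold the low 24 bits, which matches the remark that the adder gives no overflow indication.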


A decoder is another example of a logic network. It translates each value of the
contents of a register to one and only one of the outputs. For a register of n bits, the
decoder attached to the register can have as many as 2^n outputs. Therefore, it is not
convenient to describe a decoder with a large number of outputs by a terminal
statement. Furthermore, the decoder is a frequently used computer element. For these
reasons, decoders are specified by a decoder statement. The following is an example
of a decoder statement which describes the decoder in the control unit of the simple
computer:


Register, F(0-5)
Decoder, K(0-16)=F (1.10)


This decoder statement declares that K(0), ..., K(16) are the output terminals of the
decoder attached to register F. Since register F has six bits, the decoder may have
as many as 64 output terminals. However, the subscripts (0-16) indicate that only
the first 17 output terminals are required.
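
The behavior of such a decoder can be sketched in Python. The function name decoder is hypothetical, and the sketch assumes that a register value beyond the declared outputs activates none of them, which the text does not state.

```python
def decoder(register_value, outputs=17):
    """Return the list K(0)..K(outputs-1); at most one entry is 1."""
    return [1 if register_value == i else 0 for i in range(outputs)]


k = decoder(3)            # contents of F equal to 3 selects K(3)
unused = decoder(20)      # assumed: values above 16 select no declared output
```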


1.1.6 Clock Statements


A clock is an electronic device which generates a sequence or sequences of pulses
to initiate the digital action of the logic circuits in a computer. The time interval
between two adjacent clock pulses is called the clock period. A clock is described
by a clock statement. The clock statement which describes the three-phase clock of
the computer is


Clock, P(1-3) (1.11)


Statement (1.11) declares that the name of the clock pulses be P and the names of
the three phases be P(1), P(2), and P(3). The clock pulses in the three phases are
equally spaced and overlapped as shown in Fig. 1.1. The adjacent pulses with one
pulse from each phase form a cycle, called the clock cycle. The three pulses in a clock
cycle of the above clock can be used to control a three-step sequence.


Fig. 1.1 Pulses of the three-phase clock (waveforms of P(1), P(2), and P(3) against time)
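
How a multiple-phase clock can order the steps of a sequence is sketched below in Python. The sketch models only the order of the pulses, not their timing or shape; the names clock and run, and the step labels, are hypothetical.

```python
from itertools import cycle, islice


def clock(phases=3):
    """Yield the phase subscripts of successive pulses: 1, 2, ..., phases, 1, ..."""
    return cycle(range(1, phases + 1))


def run(steps, pulses):
    """Return the steps triggered by the given number of clock pulses.

    steps maps a phase subscript to the label of the step that phase controls.
    """
    return [steps[phase] for phase in islice(clock(len(steps)), pulses)]


three_step = {1: "step 1", 2: "step 2", 3: "step 3"}
trace = run(three_step, 4)   # one full clock cycle plus the first pulse of the next
```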


1.2 Micro-statements


A micro-operation is an elementary, functional operation physically built into a
digital computer. It is described by a micro-statement. The operators, expressions,
and transfers which constitute a micro-statement are described below.


1.2.1 Basic Operators 


An operator is a symbol which represents the function to be carried out by a 
logic network during one clock period. Certain operators that are frequently used are 
called basic operators. The basic operators to be employed in this book are shown in 
Table 1.1. The functions they perform are explained below. 

Basic operators are classified as logical, functional, and arithmetical. The five 
logical operators, NOT, OR, AND, EXOR, and COIN, are those used in Boolean 
algebra. However, each of these logical operators may operate on a single bit or 
individually on multiple bits. There are six functional operators (shl, shr, cil, cir, 
countup or inc, countdn or dec). The shift operators (shl, shr) shift the contents of a 


TABLE 1.1 Basic Operators

OPERATORS                SYMBOLS           EXPLANATIONS

Logical operators†
  NOT                    '                 Logical not operation
  OR                     +                 Logical or operation
  AND                    *                 Logical and operation
  EXOR                   ⊕                 Logical exclusive-or operation
  COIN                   ⊙                 Logical coincidence operation
Functional operators
  Shift left             shl               Shift left one or more bits‡
  Shift right            shr               Shift right one or more bits‡
  Circulate left         cil               Circulate left one or more bits
  Circulate right        cir               Circulate right one or more bits
  Count up               countup or inc    Increment count by one
  Count down             countdn or dec    Decrement count by one
Arithmetical operators
  Addition               add               Add one unsigned binary number to another
  Subtraction            sub               Subtract one unsigned binary number from another

†Logical operators operate on a single bit or individually on multiple bits.
‡Zeros are inserted at the right end for the left shift and at the left end for the right shift.




register one or more bits to the left or right with one or more zeros inserted at the
right or left end, respectively. The circulate operators (cil, cir) are shift operators
except that one end of the register is connected with the other end. The count
operators (countup or inc, countdn or dec) increment or decrement the contents of a
register by one. There are two arithmetical operators which add or subtract one
unsigned binary number to or from another unsigned binary number, respectively.

An operator can be unary, binary, or both. A unary operator needs one operand,
while a binary operator needs two. Operators NOT, countup, and countdn are unary
operators. Operators +, *, ⊕, ⊙, add, and sub are binary operators. The other
operators in Table 1.1 can be either unary or binary. Examples of using these operators
will be shown subsequently.
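
The functional operators of Table 1.1 can be imitated in software. The following is a minimal sketch in Python, not part of CDL; the function signatures, the explicit width argument, and the wrap-around of the count operators at the ends of the counting range are assumptions made for illustration.

```python
# Hypothetical helpers that mimic the functional operators of Table 1.1.
# A register is modelled as a non-negative int together with its width in bits.

def shl(value, width, count=1):
    """Shift left; zeros enter at the right end and bits leaving the left end are lost."""
    return (value << count) & ((1 << width) - 1)

def shr(value, width, count=1):
    """Shift right; zeros enter at the left end."""
    return (value & ((1 << width) - 1)) >> count

def cil(value, width, count=1):
    """Circulate left; bits leaving the left end re-enter at the right end."""
    mask = (1 << width) - 1
    count %= width
    value &= mask
    return ((value << count) | (value >> (width - count))) & mask

def cir(value, width, count=1):
    """Circulate right; bits leaving the right end re-enter at the left end."""
    mask = (1 << width) - 1
    count %= width
    value &= mask
    return ((value >> count) | (value << (width - count))) & mask

def countup(value, width):
    """Increment by one (assumed to wrap around to zero after the largest count)."""
    return (value + 1) & ((1 << width) - 1)

def countdn(value, width):
    """Decrement by one (assumed to wrap around to the largest count below zero)."""
    return (value - 1) & ((1 << width) - 1)
```

For a four-bit register holding 1011, shl gives 0110 and cil gives 0111, which shows the difference between shifting and circulating.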


1.2.2 Expressions 


An expression is a formatted combination of constant, operator, and/or operand
to describe a part of a micro-statement. It represents the function which a
micro-operation performs before the transfer occurs in the micro-operation. An expression
can be an operand such as A and OV, a unary operator with one operand such as
shl A and countup D, or a binary operator with two operands such as A add R. It
can also be a binary, an octal, or a decimal constant. A convention concerning the
use of a subscript for the constant is established as follows: A decimal constant is not
subscripted. An octal constant is always subscripted. A binary constant is subscripted if
it is not apparent. Additional examples of expressions are:


X'
shl Y(ADDR)
X ⊕ Y (1.12)
3 shl X
5 cir Y


where X and Y are registers and Y(ADDR) is a subregister. 


1.2.3 Micro-operations 


A micro-operation transfers a constant or the contents of one or two registers to 
another register; an operation is performed during the transfer. This transfer between 
one register and another is the unique characteristic of a micro-operation. 

As mentioned in Section 1.2, a micro-operation is described by a micro-statement 
which consists of two parts: one of them describes the operands and the operator of 
the micro-operations, and the other describes the transfer and the result of the micro- 
operation. The former is described by an expression, and the latter by an arrowhead 
for the transfer and the name of the register where the result is stored. Consider the 




micro-operation which transfers the contents of register R to register A. The micro- 
statement which describes this micro-operation is 


A←R (1.13)


Note that in this micro-statement the arrowhead symbolizes the transfer, and
the expression (at the right side of the arrowhead) is merely the name of a register.

It is important to note that the transfer in a micro-operation is not physically 
instantaneous; a finite amount of time is required. A clock period larger than this 
amount of time is usually chosen so that the micro-operation can be completed within 
the clock period. One should always bear in mind that the arrowhead symbol in a 
micro-statement involves a delay in the transfer of a micro-operation. 

The delay in a transfer between the registers is sometimes beneficial if it is properly 
implemented. For example, the following count-up micro-operation, 


D←countup D (1.14)


as shown in Fig. 1.2, allows the outputs of register D to go through a count logic
network and then transfers the outputs of the logic network back to register D. In
order to make this micro-operation feasible, a delay must exist when the signals 
travel the path from the outputs of the register to the inputs. This delay can be a 
parasitic delay of the logic circuits, or more often a delay element inserted through 
the path. Figure 1.2(a) and 1.2(b) illustrate a delay inserted at the output side and 
the input side of register D, respectively. 

Micro-operations may be classified into set-constant micro-operations, transfer- 
only micro-operations, unary micro-operations, and binary micro-operations. Exam- 


Fig. 1.2 Delays in the path of a register transfer: (a) delays at
the output side of the register; (b) delays at the input
side of the register




ples of these four types of micro-operations are shown in Table 1.3. Notice that the 
constant in a micro-operation is either octal or binary. 


1.2.4 Conditional Micro-statements 


When a micro-operation (or micro-operations) is to be carried out on certain 
conditions, it is described by a conditional micro-statement. There are two types of 
conditional micro-statement, IF-THEN and IF-THEN-ELSE, as shown in the follow- 
ing: 


Register, G, F(0-5), C(0-14), D(0-14),
IF (G=0) THEN (F←10), (1.15)
IF (G≠1) THEN (C←0, D←0) ELSE (F←11),


The conditions (G=0) or (G≠1) in a conditional micro-statement are expressed by
the symbols = (equal) or ≠ (not equal). When the condition is true, the
micro-statements enclosed by the words THEN and ELSE (if ELSE exists) are executed
simultaneously (i.e., parallel operation). Thus, the order in which these
micro-statements are written is of no significance. When the condition is not true, the
micro-statement (or micro-statements) following the word ELSE is executed; nothing
happens if there is no micro-statement following the word ELSE.

The micro-statement in a conditional micro-statement can be another conditional
micro-statement. For example, in the following conditional micro-statement,


Register, A(2-1)
IF (A(1)=0) THEN (A(1)←1) (1.16)
ELSE (IF (A(2)=0) THEN (A(2)←1, A(1)←0) ELSE (A←0))

the micro-statement following ELSE is another conditional micro-statement.
The conditions in (1.16) are expressed for a single bit, but can also be expressed
for multiple bits. For example, in the following conditional micro-statement,

Register, C(5-0) $counter
IF (C=77₈) THEN (...............) (1.17)


the condition to test counter C to be equal to 77₈ is a test of six bits.
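
The nested conditional micro-statement (1.16) can be traced in software. The sketch below is an illustration in Python rather than CDL; the function name step_a and the representation of register A as a pair of bits are assumptions. Every condition is tested against the old contents of A, which mirrors the fact that the transfers of one micro-statement occur together.

```python
def step_a(a2, a1):
    """Apply conditional micro-statement (1.16) once to the two-bit register A(2-1).

    Returns the new pair (A(2), A(1)); all tests use the old bit values.
    """
    if a1 == 0:
        return a2, 1          # A(1) <- 1, A(2) unchanged
    if a2 == 0:
        return 1, 0           # A(2) <- 1, A(1) <- 0
    return 0, 0               # A <- 0


def trace(steps):
    """Return the successive contents of A, starting from 00."""
    state, history = (0, 0), []
    for _ in range(steps):
        state = step_a(*state)
        history.append(state)
    return history
```

Starting from A = 00, repeated application yields 01, 10, 11, 00, so (1.16) describes a two-bit binary counter.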

The value of a condition is logical because the condition is either 1 (true) or 
0 (false). Therefore, logical operators can be used to form a conditional expression. 
For example, if the condition is bit X(1) is 1 and bit Y(2) is 0 and switch START is 
at the ON position, then we have, 


IF ((X(1)=1)*(Y(2)=0)*(START=ON)) THEN (...............) (1.18)


With the logical operators, a conditional expression can be more complex than 
the one shown above. It may become lengthy and difficult to read. However, the 


Sec. 1.2 Micro-statements 11 


conditional expression can be simplified if we convert each condition into a Boolean 
variable. For example, if we substitute 

X(1)=1 or X(1)≠0 by X(1),

Y(2)=0 or Y(2)≠1 by Y(2)',


and
START=ON by START(ON),


the above conditional micro-statement now becomes a concise and readable state- 
ment as shown below: 


IF (X(1)*Y(2)'*START(ON)) THEN (...............) (1.19)

This simplification is not practical in such conditions as,


X(1)≠Y(2),
C=77₈, (1.20)
F(0-2)=R(4-6)


where a single-bit is compared with another single-bit or where a multiple-bit is com- 
pared with a multiple-bit or a constant. 


1.2.5 Special Operators 


The basic operators in Table 1.1 are not sufficient to describe all micro-opera- 
tions. When such an occasion arises, a special operator is defined. Since an operator 
represents a logic network, the definition of a special operator specifies a special logic 
network. To define a special operator, a symbolic name is chosen which should be 
different from the names of all operators. 

A special operator is defined by CDL statements. For example, consider a two- 
bit counter which counts according to the following sequence, 00, 10, 11, 01, 00, .... 
Let the name of this special counter be ct2. The definition of operator ct2 is, 


Operator, D←ct2 D(0-1), (1.21)
/begin/ IF (D=00) THEN (D←10),
        IF (D=10) THEN (D←11),
        IF (D=11) THEN (D←01),
        IF (D=01) THEN (D←00),
end of operator


In (1.21), D is the dummy variable which, in this case, represents both inputs and




outputs. The definition proper is enclosed by the label /begin/ and by the “end of 
operator” statement; it consists of four conditional micro-statements. 
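
As an illustration, the special operator ct2 of (1.21) can be imitated by a small Python function. This is only a sketch; the function name and the use of an integer for the two-bit operand are assumptions. The four conditions are all tested against the old value of D, so at most one of them takes effect per application.

```python
def ct2(d):
    """Special counter of (1.21): counts 00, 10, 11, 01, 00, ...

    The four IF tests all examine the old value d, as the four
    conditional micro-statements of the definition do.
    """
    new_d = d
    if d == 0b00:
        new_d = 0b10
    if d == 0b10:
        new_d = 0b11
    if d == 0b11:
        new_d = 0b01
    if d == 0b01:
        new_d = 0b00
    return new_d


def count_sequence(start, steps):
    """Return the values taken by D over the given number of applications of ct2."""
    values, d = [], start
    for _ in range(steps):
        d = ct2(d)
        values.append(d)
    return values
```

Writing to new_d instead of d keeps a later test from seeing a value set by an earlier one, which would otherwise let two transfers fire in a single step.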

The definition of the logic network for a special operator can also be expressed 
by Boolean equations. For instance, consider a parallel adder which adds two un- 
signed binary numbers and indicates an overflow when it occurs. Let this special 
operator be called addov. The definition of operator addov is, 


Operator, OV-A(0-23)←X(0-23) addov Y(0-23) (1.22)
Terminal, C(23)=0,
          C(0-22)=X(1-23)*Y(1-23)+Y(1-23)*C(1-23)+C(1-23)*X(1-23),
/begin/   A(0-23)=X(0-23)⊕Y(0-23)⊕C(0-23),
          OV=X(0)*Y(0)+Y(0)*C(0)+C(0)*X(0),
end of operator


In (1.22), variables X and Y represent the inputs and OV and A represent the outputs 
of the logic network. A terminal statement is used to define internal variables C(0—23) 
which are the carries of the parallel adder. 
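
The Boolean equations of (1.22) can be checked by evaluating them bit by bit. The following Python sketch does so under stated assumptions: words are held as integers, bit 0 is taken as the leftmost (most significant) bit and bit 23 as the rightmost, and the helper names bit and addov are chosen only for this illustration.

```python
WIDTH = 24

def bit(word, i, width=WIDTH):
    """Bit i of word, where bit 0 is the leftmost bit (assumed numbering)."""
    return (word >> (width - 1 - i)) & 1

def addov(x, y, width=WIDTH):
    """Evaluate the equations of (1.22); returns the pair (OV, A)."""
    c = [0] * width                      # c[width-1] = 0 corresponds to C(23)=0
    for i in range(width - 2, -1, -1):   # C(i) is formed from position i+1
        xi, yi, ci = bit(x, i + 1, width), bit(y, i + 1, width), c[i + 1]
        c[i] = (xi & yi) | (yi & ci) | (ci & xi)
    a = 0
    for i in range(width):               # A(i) = X(i) exor Y(i) exor C(i)
        a = (a << 1) | (bit(x, i, width) ^ bit(y, i, width) ^ c[i])
    x0, y0 = bit(x, 0, width), bit(y, 0, width)
    ov = (x0 & y0) | (y0 & c[0]) | (c[0] & x0)
    return ov, a
```

With these equations OV is the carry out of the leftmost bit position, so for unsigned operands it is 1 exactly when the true sum does not fit in 24 bits.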


1.3 Sequencing 


A micro-operation is a single operation and it takes several micro-operations to
perform a useful function. To make a large number of micro-operations perform 
many functions in a digital computer requires that these micro-operations be con- 
trolled, organized, and sequenced. The label, the execution statement, and the control 
sequence are now introduced in order to accomplish this objective. 


1.3.1 Execution Statements 


A micro-statement does not specify when a micro-operation occurs. A micro-operation
is initiated by a control signal (see Fig. 1.2) which is represented by a label.
Since the control signal is binary, the label is a logical variable. When the value of 
the label is true, the micro-statements, which represent the micro-operations con- 
trolled by the control signal, are executed; otherwise, nothing happens to these 
micro-statements. 

An execution statement consists of a label and one or more micro-statements 
related to the label. It may be regarded as a controlled micro-statement or a group 
of controlled micro-statements. For example, let N be the single-bit register which 
issues a command signal when it is 1, and P be the pulse of a single-phase clock. 
Let the control signal be formed by the logical-and operation of N and P. If register 
N initially contains a 1, the following statements illustrate the execution of a con- 
trolled count-up micro-operation: 




Register, D(3-0), N,
Clock, P, (1.23)
/N*P/ D←countup D,


The last statement in (1.23) is an execution statement which has the label N*P enclosed
by slashes. Since the initial value of register N is one, register D will be incremented
by one when the next clock pulse occurs, because at that time the logical value of label
N*P is one. Since register N does not change, the value of the label again becomes
one each time a clock pulse appears, and register D will be incremented by one each time
a clock pulse appears. However, if the execution statement is replaced by the following,


/N*P/ D←countup D, N←0, (1.24)


register D will be incremented by one only once, because the contents of register N
are changed to 0 when register D is being incremented and the logical value of label
N*P is no longer one. Notice that statement (1.24) is an execution statement in which
two micro-operations are controlled by one control signal.
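
The difference between execution statements (1.23) and (1.24) can be seen in a short simulation. The Python sketch below is illustrative only; the function name run and the flag clear_n are assumptions, and each loop iteration stands for one clock pulse P. New register values are computed from the old ones and then stored together, as the micro-operations under one label are carried out simultaneously.

```python
def run(pulses, clear_n):
    """Simulate /N*P/ D<-countup D (clear_n False) or
    /N*P/ D<-countup D, N<-0 (clear_n True) for a number of clock pulses.

    Returns the final contents of the four-bit register D.
    """
    d, n = 0, 1                       # register N initially contains a 1
    for _ in range(pulses):
        if n == 1:                    # label N*P is true during this pulse
            new_d = (d + 1) & 0b1111  # D <- countup D, from the old value of D
            new_n = 0 if clear_n else n
            d, n = new_d, new_n       # both transfers take effect together
    return d
```

After five pulses, run(5, False) leaves 5 in D while run(5, True) leaves 1, because in the second case the label is true only for the first pulse.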


1.3.2 Sequencing by a Multiple-phase Clock 


The fetch sequence of the simple digital computer consists of three steps as shown 
in the following: 


step 1, C←D,
     2, R←M(C), D←countup D (1.25)
     3, F←R(OP), C←R(ADDR)


The first step transfers the contents of register D to register C. The second step 
transfers the memory word located at the address in register C to register R and 
increments register D by one. The third step transfers the op-code part of register R 
to register F and the address part of register R to register C. A control sequence is
required to sequence these three steps. A multiple-phase clock generates signals for 
a control sequence. If a three-phase clock is used to sequence these three steps, we 
have the fetch sequence, 


Clock, P(1-3),
/FETCH*P(1)/ C←D,
/FETCH*P(2)/ R←M(C), D←countup D, (1.26)
/FETCH*P(3)/ F←R(OP), C←R(ADDR),
End


where word FETCH is a command signal and the end statement is the end of a 
sequence. 
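
The three steps of the fetch sequence (1.26) can be followed in a small simulation. The Python sketch below is an illustration, not a CDL description; the dictionary of registers and the function name fetch are assumptions. The field widths follow the instruction format of the simple computer described in Section 1.4: a 24-bit word with a six-bit op-code at the left and a 15-bit address at the right.

```python
WORD_BITS, OP_BITS, ADDR_BITS = 24, 6, 15
ADDR_MASK = (1 << ADDR_BITS) - 1

def fetch(regs, memory):
    """Carry out the three steps of the fetch sequence on the register dictionary."""
    # Step 1, label FETCH*P(1): C <- D
    regs["C"] = regs["D"]
    # Step 2, label FETCH*P(2): R <- M(C), D <- countup D
    regs["R"] = memory[regs["C"]]
    regs["D"] = (regs["D"] + 1) & ADDR_MASK
    # Step 3, label FETCH*P(3): F <- R(OP), C <- R(ADDR)
    regs["F"] = regs["R"] >> (WORD_BITS - OP_BITS)
    regs["C"] = regs["R"] & ADDR_MASK
    return regs
```

For example, with the word 03 (op-code) and 00123 (address), both octal, stored at address 0 and D initially 0, the sequence leaves F = 3, C = 123 octal, and D = 1.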




1.3.3 Sequencing by a Control Register 


Instead of a multiple-phase clock, a control sequence can be formed by using 
a control register with a decoder attached to it. As an example, let control register be 
D, clock be P, and decoder terminals be K's. The control sequence of four steps is 
described by the following statements: 


Register, D(0-1),
Decoder, K(0-3)=D,
Clock, P,
/K(0)*P/ D←countup D, (1.27)
/K(1)*P/ D←countup D,
/K(2)*P/ D←countup D,
/K(3)*P/ D←countup D,
End


The control sequence described in statements (1.26) is generated by the clock, 
but the control sequence above is advanced by the countup-D micro-operation. 
The countup micro-operation is used because the sequence merely increments the 
contents of the control register D each time by one. If a control sequence requires a 
sequence of arbitrary values in register D, constant-setting micro-operations have 
to be used. For example, consider the counting sequence described in statements in 
(1.21) which can be implemented by the sequence described as follows: 


Register, D(0-1),
Decoder, K(0-3)=D,
Clock, P,
/K(0)*P/ D←2,
/K(1)*P/ D←0, (1.28)
/K(2)*P/ D←3,
/K(3)*P/ D←1,
End
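
A control register with a decoder can be modelled as follows. The Python sketch is an illustration only; the names decoder, NEXT_VALUE, and clock_pulse are assumptions. The constants loaded into D follow the counting sequence 00, 10, 11, 01, 00, . . . of (1.21).

```python
# Constant set into D when terminal K(i) is active, following 00, 10, 11, 01, 00, ...
NEXT_VALUE = {0: 2, 2: 3, 3: 1, 1: 0}

def decoder(d):
    """Terminals K(0-3): K(i) is 1 only when register D contains i."""
    return [int(d == i) for i in range(4)]

def clock_pulse(d):
    """Return the new contents of D after one clock pulse P."""
    k = decoder(d)
    for i in range(4):
        if k[i]:                      # label K(i)*P is true
            return NEXT_VALUE[i]      # constant-setting micro-operation
    raise ValueError("register D must contain a value from 0 to 3")
```

Because exactly one decoder terminal is active at a time, exactly one constant-setting micro-operation is carried out on each clock pulse.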


1.3.4 Sequencing by a Multiple-control Sequence


A sequence is a series of execution statements designed to perform a specific 
function. This series is controlled by a control sequence. Since the function can be 
one required by an instruction of a digital computer, a multiple-control sequence is 




required when a set of instructions of a digital computer is implemented. One way to 
generate a multiple-control sequence is to make use of the combination of a control 
register and a multiple-phase clock. 


For example, consider the generation of four three-step control sequences. The 
following statements describe the generation of these control sequences. 

Register, F(0-1)
Decoder, K(0-3)=F
Clock, P(1-3)
Comment, here begins the sequence when F is 0.
/K(0)*P(1)/
/K(0)*P(2)/
/K(0)*P(3)/ F←countup F
Comment, here begins the sequence when F is 1.
/K(1)*P(1)/
/K(1)*P(2)/
/K(1)*P(3)/ F←countup F
Comment, here begins the sequence when F is 2.
/K(2)*P(1)/
/K(2)*P(2)/
/K(2)*P(3)/ F←countup F
Comment, here begins the sequence when F is 3.
/K(3)*P(1)/
/K(3)*P(2)/
/K(3)*P(3)/ F←countup F
End (1.29)


In the above statements, the line beginning with comment is called a comment
statement. Like the previously mentioned comment phrases, a comment statement
can be inserted anywhere to improve the readability of the description. Each of the
above four three-step sequences begins with a comment statement. The three steps
in each of these sequences are advanced by the three phases of the clock. On the
other hand, the four sequences are advanced by the four incrementing-counter-F
micro-operations because they are advanced according to the contents in register F
by following the ascending order of two-bit binary numbers (00, 01, 10, 11, 00, . . .).
If these sequences are advanced in an arbitrary order, constant-setting
micro-operations are used instead. The use of a multiple-phase clock, however, makes the number
of steps of these sequences identical. The multiple control sequences in the simple




digital computer are similar to those described in statements (1.29) except that con- 
stant-setting micro-operations are used as will be shown. 
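
The order in which the labels of (1.29) become true can be listed by a short program. The Python sketch below is an illustration; the function name control_signals is an assumption, and the only micro-operation modelled is the incrementing of the two-bit register F at phase P(3).

```python
def control_signals(pulses):
    """Return the labels that are true on successive clock pulses,
    starting with F = 0 and phase P(1)."""
    f, labels = 0, []
    for n in range(pulses):
        phase = n % 3 + 1                      # the clock cycles through P(1), P(2), P(3)
        labels.append("K(%d)*P(%d)" % (f, phase))
        if phase == 3:
            f = (f + 1) & 0b11                 # F <- countup F, two-bit register
    return labels
```

Twelve pulses run through all four three-step sequences, and the thirteenth pulse returns to K(0)*P(1).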


1.4 Description of a Stored-program Computer 


After introducing the elements, micro-operations, and sequencing of the simple
digital computer, the complete computer will now be described. The description
consists of the configuration, the formats, the instruction set, the sequence chart,
and the statements. Another version of the computer will be presented to illustrate
the use of special operators.


1.4.1 Configuration 


The configuration of the computer is shown in the block diagram in Fig. 1.3. 
It can be described by the following statements: 


Register,    R(0-23),               $buffer register
             A(0-23),               $accumulator
             C(0-14),               $address register
             D(0-14),               $program counter
             F(0-5),                $control register
             G,                     $start-stop control register
Subregister, R(OP)=R(0-5),          $op-code part of register R
             R(I)=R(6),             $indirect addressing bit of R
             R(X)=R(7-8),           $indexing bit of R
             R(ADDR)=R(9-23),       $address part of R
Memory,      M(C)=M(0-32767,0-23),
Decoder,     K(0-9)=F,
Switch,      POWER(ON),             $initialize computer operation
             START(ON),             $start computer operation
             STOP(ON),              $stop computer operation
Terminal,    ADD=K(0),              $add sequence command               (1.30)
             SUB=K(1),              $subtract sequence command
             JOM=K(2),              $jump-on-minus command
             STO=K(3),              $store-accumulator command
             JMP=K(4),              $jump command
             SHR=K(5),              $shift right one-bit command
             CIL=K(6),              $circulate left shift one-bit command
             CLA=K(7),              $clear accumulator command
             STP=K(8),              $stop command
             FETCH=K(9),            $fetch sequence command
Clock,       P(1-3),                $three-phase clock


The configuration above contains a memory unit, a control unit, and an arith- 
metic unit. The memory unit consists of memory M, buffer register R, and address 


Fig. 1.3 Configuration of a simple, stored-program computer (block diagram showing memory M(0-32767, 0-23), the control network with its control signals, and the ON switches)




register C. The control unit consists of control register F and its associated decoder, 
program counter D, start-stop control register G, clock P, and switches POWER, 
START, and STOP. The terminals K(0-9) of the decoder give the command signals.
When register G is 1, the computer is in the go state; otherwise, it is in the wait state. 
When turned on, switch POWER initializes computer operation, switch START 
begins it, and switch STOP terminates it. The arithmetic unit consists of register A, 
register R, and a 24-bit parallel adder-subtracter which adds the number in register 
R to that in register A, and subtracts the number in register R from that in register 
A. This addition and subtraction deals with the numbers as if they were unsigned 
binary numbers. Overflow is not indicated, but ignored. 

In statements in (1.30), four subregisters of register R are declared, each repre- 
senting one of the four fields of the instruction. Since indexing and indirect addressing 
are to be discussed later in the book, these two fields are not included in this chapter. 
Also in statement (1.30), each terminal from the decoder is given a symbolic name to 
indicate the sequence which the command signal on that terminal commands. 


1.4.2 Formats 


Each word of the computer contains 24 bits. As an instruction, the word consists 
of a six-bit op-code field, a one-bit indirect-addressing field, a two-bit indexing field, 
and a 15-bit address field. As a number, the word consists of 23 number bits plus a 
sign bit. The number bits represent a fractional binary number with the binary point 
located between the sign bit and the most significant number bit. The numbers are 
in the signed two’s complement representation (1). The instruction and number
formats are shown in Fig. 1.4.


Fig. 1.4 Word formats: (a) instruction format (bits 0-5, 6, 7-8, and 9-23); (b) number
format (sign bit followed by bits 1-23)
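
The two word formats can be made concrete with a short sketch. The Python functions below are illustrations only; their names and the dictionary keys are assumptions. A word is held as a 24-bit integer whose leftmost bit is bit 0, so the op-code occupies the six leftmost bits and the address the fifteen rightmost bits.

```python
from fractions import Fraction

def decode_instruction(word):
    """Split a 24-bit word into the four fields of the instruction format."""
    return {
        "op": (word >> 18) & 0o77,       # bits 0-5, six-bit op-code
        "i": (word >> 17) & 0b1,         # bit 6, indirect-addressing field
        "x": (word >> 15) & 0b11,        # bits 7-8, indexing field
        "addr": word & 0o77777,          # bits 9-23, 15-bit address
    }

def fraction_value(word):
    """Value of a 24-bit word read as a signed two's complement fraction,
    with the binary point between the sign bit and the first number bit."""
    if word & (1 << 23):                 # sign bit set: the number is negative
        word -= 1 << 24
    return Fraction(word, 1 << 23)
```

For instance, the word whose leftmost two bits are 01 and whose other bits are 0 represents the fraction 1/2, while the word with only the sign bit set represents -1.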


1.4.3 Instruction Set 


There are nine instructions: ADD, SUB, JOM, STO, JMP, SHR, CIL, CLA, and 
STP. These instructions and their op-codes are shown in Table 1.2, where m repre- 
sents a memory address in the address field of the instruction; this memory address 




can be either an instruction or an operand address. Instruction SHR shifts the word 
in the accumulator one bit to the right with a zero inserted at the left end. Instruction 
CIL circularly shifts the word in the accumulator one bit to the left. And instruction 
STP stops the computer operation. These three instructions do not require address 
m. 


TABLE 1.2 The Instruction Set

INSTRUCTION NAME       SYMBOLIC CODE    OP-CODE
Addition               ADD m            00
Subtraction            SUB m            01
Jump on minus          JOM m            02
Store                  STO m            03
Jump                   JMP m            04
Shift right            SHR              05
Circular left shift    CIL              06
Clear add              CLA m            07
Stop                   STP              10

Note: m denotes a memory address. Op-code is octal.


Instructions ADD and SUB add and subtract, respectively, the number in the 
memory location at address m to or from that in the accumulator. Instruction CLA 
functions the same way as does instruction ADD except that the accumulator is 
cleared before the addition is performed. Instruction STO stores the number in the 
accumulator into the memory location at address m. These are all operand addresses. 

Instruction JMP takes the next instruction from the memory location at address 
m; so does instruction JOM when the number in the accumulator is negative. These 
are instruction addresses. 


1.4.4 Sequence Chart 


The computer executes one control cycle after another. During each control 
cycle, an instruction from the program stored in the memory is executed. The instruc- 
tions are thus executed one after another by following the ascending order of the 
addresses of the memory. This order of instruction execution is called the normal 
order. 

During a control cycle, the fetch sequence is carried out, followed by the execu- 
tion sequence. The fetch sequence receives the instruction from the memory, decodes 
the op-code, transfers the operand address to the address register, and forms the next 
instruction address as these micro-operations were described in statements (1.26). 
The execution sequence carries out the micro-operations specifically required by the 
instruction. There is only one fetch sequence because the micro-operations required 
by each instruction are the same; but there are nine execution sequences, one for 




each of the nine instructions. In short, the computer executes, beginning with the 
fetch sequence, followed by an execution sequence and the fetch sequence alterna- 
tively. 

The sequential operation of the computer is described by the sequence chart
shown in Fig. 1.5. The upper part of the chart shows the single path for the fetch
sequence, and the lower portion demonstrates the nine parallel paths for the nine
execution sequences. These paths converge and return to the fetch sequence. When
the POWER switch is turned to the ON position, register G is set to 0 and the
computer is in the wait state during which registers C and D are continually reset to 0.
Notice the wait loop at the lower right-hand corner of the chart. When the START
switch is pressed and register G is set to 1, the computer enters the go state and
begins execution of the fetch sequence. The first instruction to be executed is in address
0 because the address register has been reset to 0 during the wait state. As noted from
the sequence chart, the computer continues execution of either the fetch sequence
or an execution sequence until a STP instruction is encountered or the STOP switch
is pressed; that is, in either of these two cases, the computer enters the wait loop.
During the fetch sequence, the contents of register G are tested. If G is 1, the execution
continues; otherwise, the execution enters the wait loop.

At the end of the fetch sequence, the op-code of the instruction is in register F.
By means of the decoder, the respective command signal is activated and the
particular execution sequence now begins. As shown in the sequence chart, each execution
sequence takes one or two steps. It is assumed that the reading part of the memory
cycle occurs during clock P(2) and the writing part during clock P(3). The nine
execution sequences are: ADD, SUB, CLA, STO, JMP, JOM, SHR, CIL, and STP. The
ADD sequence takes two steps, transferring the word from the memory to register R
and then adding it to register A. Both the SUB and CLA sequences are similar to the
ADD sequence, except that in the SUB sequence a subtraction is performed and in
the CLA sequence register A is first reset to 0. The STO sequence transfers the
contents of register A to register R and then stores it into the memory. The SHR and
CIL sequences perform shift-right one-bit and circulate-left one-bit micro-operations,
respectively. The JMP sequence transfers the address part in register R to register
D so that this address will become the address of the next instruction. The JOM
sequence executes like a JMP sequence when the number in register A is negative.
Thus, a JMP sequence breaks the normal order of instruction execution and a JOM
sequence does so conditionally.


1.4.5 Statement Description 


By means of a multiple-control sequence, the sequence chart can be readily 
converted into a statement description. The multiple-control sequence is identical 
to the one described in statements (1.29) except that micro-operations which set 
register F to a constant are employed. The description of the simple, stored-program 
computer by statements is:


Fig. 1.5 Sequence chart of the simple, stored-program computer




Comment, subscripts are decimal and constants are octal.
Comment, here are start-stop micro-operations.
/POWER(ON)/        G←0, F←8,            $set F to start the stop sequence
/START(ON)*P(2)/   G←1,
/STOP(ON)/         G←0,
Comment, here is the fetch sequence.
/FETCH*P(1)/       C←D, IF (G=0) THEN (F←8),
/FETCH*P(2)/       R←M(C), D←countup D,
/FETCH*P(3)/       F←R(OP), C←R(ADDR),
Comment, here is the add sequence.
/ADD*P(2)/         R←M(C),
/ADD*P(3)/         A←A add R, F←9,
Comment, here is the sub sequence.
/SUB*P(2)/         R←M(C),
/SUB*P(3)/         A←A sub R, F←9,                                     (1.31)
Comment, here is the store sequence.
/STO*P(1)/         R←A,
/STO*P(3)/         M(C)←R, F←9,
Comment, here is the clear add sequence.
/CLA*P(2)/         R←M(C), A←0,
/CLA*P(3)/         A←A add R, F←9,
Comment, here is the stop sequence.
/STP*P(1)/         G←0,
/STP*P(3)/         IF (G=0) THEN (C←0, D←0) ELSE (F←9),
Comment, here are the other sequences.
/JMP*P(3)/         D←R(ADDR), F←9,
/JOM*P(3)/         IF (A(0)) THEN (D←R(ADDR)), F←9,
/SHR*P(3)/         A←shr A, F←9,
/CIL*P(3)/         A←cil A, F←9,
End
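
The behavior described by statements (1.31) can be tried out with an instruction-level simulation. The Python sketch below is a simplification under stated assumptions, not a translation of the statement description: it does not model the clock phases, the switches, or register G, it simply starts at address 0 and runs until a STP instruction is met, and the names word and run are chosen for this illustration. The op-codes are those of Table 1.2.

```python
MASK24 = (1 << 24) - 1
MASK15 = (1 << 15) - 1
# Op-codes of Table 1.2 (octal 00 through 07, and 10 for STP).
ADD, SUB, JOM, STO, JMP, SHR, CIL, CLA, STP = range(9)

def word(op, addr=0):
    """Assemble an instruction word: op-code in bits 0-5, address in bits 9-23."""
    return (op << 18) | (addr & MASK15)

def run(memory, max_cycles=1000):
    """Execute the program in memory from address 0; return accumulator A at STP."""
    a = d = 0
    for _ in range(max_cycles):
        # Fetch sequence: C<-D; R<-M(C), D<-countup D; F<-R(OP), C<-R(ADDR)
        c = d
        r = memory[c]
        d = (d + 1) & MASK15
        f, c = r >> 18, r & MASK15
        # Execution sequence selected by the op-code in F
        if f == ADD:
            r = memory[c]
            a = (a + r) & MASK24          # unsigned addition, overflow ignored
        elif f == SUB:
            r = memory[c]
            a = (a - r) & MASK24          # unsigned subtraction, overflow ignored
        elif f == CLA:
            r = memory[c]
            a = r & MASK24                # A cleared first, then R added
        elif f == STO:
            r = a
            memory[c] = r
        elif f == JMP:
            d = r & MASK15                # D <- R(ADDR)
        elif f == JOM:
            if a >> 23:                   # A(0) = 1, i.e. the number in A is negative
                d = r & MASK15
        elif f == SHR:
            a >>= 1                       # zero inserted at the left end
        elif f == CIL:
            a = ((a << 1) | (a >> 23)) & MASK24
        elif f == STP:
            return a
        else:
            raise ValueError("undefined op-code %o" % f)
    raise RuntimeError("program did not reach STP")
```

As a usage example, a program consisting of CLA 10, ADD 11, SUB 12, STO 13, STP (decimal addresses), with the numbers 5, 3, and 1 stored at addresses 10 through 12, leaves 7 in the accumulator and in memory location 13.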


1.4.6 Data and Control Paths 


Data and control paths are wires which interconnect computer elements. These 


are the multitudes of wires that one sees behind the back panel of a digital computer. 




Breakage of a single wire may cause the computer to operate erroneously. These 
wires constitute a part of the hardware which implements the micro-operations, test 
conditions, control signals, and manual control operations (see Table 1.3). 

There are seven groups in Table 1.3. The first group consists of the constant- 
setting micro-operations such as clearing register G and setting register F to octal 
12. The second group consists of shifting and incrementing micro-operations which 
involve unary operators. There are 13 micro-operations in the first group and three 


TABLE 1.3 Micro-operations, Test Conditions, Control
          Signals, and Manual Control Operations of
          the Computer

1. Set-constant micro-operations
   G←0
   G←1
   C←0
   D←0
   A←0
   F←8
   F←9

2. Unary micro-operations
   D←countup D
   A←shr A
   A←cil A

3. Transfer-only micro-operations
   C←D
   C←R(ADDR)
   D←R(ADDR)
   F←R(OP)
   R←A
   R←M(C)
   M(C)←R

4. Binary micro-operations
   A←A add R
   A←A sub R

5. Test conditions
   G=0
   G=1
   A(0)=1

6. Control signals
   K(0)*P(2),
   ...
   K(9)*P(3),

7. Manual control operations
   POWER←ON,
   START←ON,
   STOP←ON,




in the second group. Each of these micro-operations involves only one register; no 
wiring between the elements is required. 

The third group contains transfer-only micro-operations such as register transfers 
and memory transfers. The fourth group consists of micro-operations such as addition 
and subtraction. These micro-operations involve binary operators, add and sub. 
There are seven micro-operations in the third group and two in the fourth group. 
Since these micro-operations involve two or more elements, wiring among the ele- 
ments is required. If these transfers are parallel as those in Table 1.3, the amount of 
wires is proportional to the number of parallel bits for each parallel transfer. 

The fifth group is composed of test conditions. For instance, it can test whether
G is 0 or whether the number in the accumulator is negative.

The sixth group consists of control signals and the seventh group contains
manual control operations. There are three test conditions in the fifth group, 17
control signals in the sixth group, and three manual control operations in the seventh
group. Since the test condition, control signal, and manual control operations are
binary, only one wire is needed for each of them. These wires may be long, however,
since the test condition may be far away from where the micro-operation takes place.

The data paths of the computer are indicated by those lines connecting the 
registers and the memory in Fig. 1.3. The control paths are not shown for the purpose 
of clarity. The selection of these paths restricts the type of micro-operations that are 
allowed. It is of great importance that the number of these paths be minimized, even 
sometimes at the expense of more steps. 


1.4.7 An Example of Using Special Operators 


Basic operators, add and sub, add and subtract, respectively, unsigned binary 
numbers. For addition and subtraction of the binary numbers in the signed 2’s com- 
plement representation (1), the sign bit is treated as a number bit. Therefore, these 
two basic operators can be used for binary addition and subtraction for the numbers 
in the signed 2’s complement representation. However, the definitions of basic opera- 
tors add and sub ignore the overflow. If the overflow is to be indicated, special opera- 
tors for addition and subtraction have to be defined. For example, two special 
operators addov and subov are defined. These operators replace the basic operators 
add and sub in the description of statements (1.31). As a result, the stored-program 
computer which employs special operators addov and subov can be described in the 
following statements: 


Register,     R(0-23),                $buffer register                        (1.32)
              A(0-23),                $accumulator
              C(0-14),                $address register
              D(0-14),                $program counter
              F(0-5),                 $control register
              G,                      $start-stop control register
              OV,                     $overflow indicator
Subregister,  R(OP)=R(0-5),
              R(ADDR)=R(9-23),
Memory,       M(C)=M(0-32767,0-23),
Decoder,      K(0-19)=F,
Switch,       POWER(ON),              $initialize computer operation
              START(ON),              $start computer operation
              STOP(ON),               $stop computer operation
Light,        LTOV(ON,OFF),           $overflow light
Terminal,     ADD=K(0),               $add sequence command
              SUB=K(1),               $subtract sequence command
              JOM=K(2),               $jump-on-minus command
              STO=K(3),               $store-accumulator command
              JMP=K(4),               $jump command
              SHR=K(5),               $shift right one-bit command
              CIL=K(6),               $circulate left shift one-bit command
              CLA=K(7),               $clear accumulator command
              STP=K(8),               $stop command
              FETCH=K(9)              $fetch sequence command
Clock,        P(1-3),

Comment, here are start-stop micro-operations.

/POWER(ON)/       G←0, F←8, OV←0, LTOV←OFF,
/START(ON)*P(2)/  G←1,
/STOP(ON)/        G←0,
Comment, here is the fetch sequence.
/FETCH*P(1)/      IF (G=0) THEN (F←8),
                  IF (OV=1) THEN (LTOV←ON, G←0, F←8) ELSE (C←D),
/FETCH*P(2)/      R←M(C), D←countup D,          $memory word available at P(2)
/FETCH*P(3)/      F←R(OP), C←R(ADDR),
Comment, here is the add sequence.
/ADD*P(2)/        R←M(C),
/ADD*P(3)/        OV-A←A addov R, F←9,
Comment, here is the sub sequence.
/SUB*P(2)/        R←M(C),
/SUB*P(3)/        OV-A←A subov R, F←9,
Comment, here is the store sequence.
/STO*P(1)/        R←A,
/STO*P(3)/        M(C)←R, F←9,                  $memory word stored at P(3)
Comment, here is the clear add sequence.
/CLA*P(2)/        R←M(C), A←0,
/CLA*P(3)/        OV-A←A addov R, F←9,
Comment, here is the stop sequence.
/STP*P(1)/        G←0,
/STP*P(3)/        IF (G=0) THEN (C←0, D←0) ELSE (F←9),
Comment, here are the other sequences.
/JMP*P(3)/        D←R(ADDR), F←9,
/JOM*P(3)/        IF (A(0)) THEN (D←R(ADDR)), F←9,
/SHR*P(3)/        A←shr A, F←9,
/CIL*P(3)/        A←cil A, F←9,
End
Operator, R(OV,0-23)←A(0-23) addov B(0-23),
Comment, both numbers in registers A and B are in the signed 2's complement
         representation. Add the contents of register B to register A. OV is
         set to 1 in case of overflow.
Terminal, C(23)=0,
          C(0-22)=A(1-23)*B(1-23)+B(1-23)*C(1-23)+C(1-23)*A(1-23),
          SUM(0-23)=A(0-23)⊕B(0-23)⊕C(0-23),
/begin/   R(0-23)=SUM(0-23),
          R(OV)=A(0)*B(0)*C(0)'+A(0)'*B(0)'*C(0),
End of operator

Operator, R(OV,0-23)←A(0-23) subov B(0-23),
Comment, both numbers in registers A and B are in the signed 2's complement
         representation. Subtract the contents of register B from register A.
         OV is set to 1 in case of overflow.
Terminal, C(23)=1,
          C(0-22)=A(1-23)*B(1-23)'+B(1-23)'*C(1-23)+C(1-23)*A(1-23),
          SUM(0-23)=A(0-23)⊕B(0-23)'⊕C(0-23),
/begin/   R(0-23)=SUM(0-23),
          R(OV)=A(0)*B(0)'*C(0)'+A(0)'*B(0)*C(0),
End of operator


In the statements in (1.32), register OV and light LTOV are added to indicate
the overflow when it occurs. This register and this light are also initialized by switch
POWER. Furthermore, the condition of register OV is tested during the first step of
the fetch sequence while register G is being tested. When register OV is 1, the
computer enters the wait state. Operator addov replaces the add micro-operation in
the ADD and CLA sequences, while operator subov replaces the sub micro-operation
in the SUB sequence. All the other statements remain the same.

The definitions of the two special operators follow the description of the
computer. The inputs are two 24-bit numbers represented by A and B, while the output
is one 25-bit number represented by R. Variables A, B, and R are all dummy variables.
Variables SUM(0-23) and C(0-23) represent respectively the sum bits and carry bits of
a parallel adder with the gated carry (1) in the definition of operator addov. They also
represent the difference bits and borrow bits in the definition of operator subov.
During addition, overflow occurs if both signs are positive and C(0) is 1 or if both
signs are negative and C(0) is 0. During subtraction, overflow can occur only when the
two signs are different; in that case, it occurs when C(0) and B(0) are both 1 or both 0.
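
The overflow rules just stated can be checked with a short Python sketch. It is not part of the CDL description: the function names mirror the operators addov and subov, but the helper names, the use of Python integers as 24-bit patterns, and the modeling of subtraction as A + B' + 1 are assumptions made for illustration only.

```python
WIDTH = 24                      # word length of registers A, B, and R
MASK = (1 << WIDTH) - 1         # 24 one-bits


def sign_bit(x):
    """Bit 0 in the book's left-to-right numbering, i.e. the sign bit."""
    return (x >> (WIDTH - 1)) & 1


def add_patterns(a, b, carry_in):
    """Add two 24-bit patterns; return (sum bits, C(0) = carry into the sign stage)."""
    number_bits = MASK >> 1     # the 23 bits to the right of the sign
    c0 = ((a & number_bits) + (b & number_bits) + carry_in) >> (WIDTH - 1)
    return (a + b + carry_in) & MASK, c0


def addov(a, b):
    """Return (OV, result) for A + B in signed 2's complement."""
    total, c0 = add_patterns(a, b, 0)
    a0, b0 = sign_bit(a), sign_bit(b)
    # R(OV) = A(0)*B(0)*C(0)' + A(0)'*B(0)'*C(0)
    ov = (a0 & b0 & (1 - c0)) | ((1 - a0) & (1 - b0) & c0)
    return ov, total


def subov(a, b):
    """Return (OV, result) for A - B, formed here as A + B' + 1."""
    total, c0 = add_patterns(a, ~b & MASK, 1)
    a0, b0 = sign_bit(a), sign_bit(b)
    # R(OV) = A(0)*B(0)'*C(0)' + A(0)'*B(0)*C(0)
    ov = (a0 & (1 - b0) & (1 - c0)) | ((1 - a0) & b0 & c0)
    return ov, total


def pattern(n):
    """24-bit 2's complement pattern of a Python integer (hypothetical helper)."""
    return n & MASK
```

For instance, adding 1 to the largest positive number 2^23 - 1 sets OV, while subtracting 1 from 0 gives the pattern of -1 without overflow.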


References 


1. Chu, Y., Digital Computer Design Fundamentals. New York: McGraw-Hill Book
Company, 1962.

2. Iverson, K. E., A Programming Language. New York: John Wiley & Sons, Inc., 1962.

3. Schorr, H., "A Register Transfer Language to Describe Digital Systems," Technical
Report No. 30, Digital Systems Laboratory, Department of Electrical Engineering,
Princeton University, September, 1962.

4. Falkoff, A. D. and Iverson, K. E., "A Formal Description of System/360," IBM
Systems Journal, 3, No. 3, 1964.

5. Schlaeppi, H. P., "A Formal Language for Describing Machine Logic, Timing and
Sequencing (LOTIS)," IEEE Trans. on Electronic Computers, August, 1964, pp. 439-448.

6. Mullery, A. P., "A Procedure Oriented Machine Language," IEEE Trans. on Electronic
Computers, August, 1964, pp. 449-455.

7. Chu, Y., "An Algol-like Computer Design Language," Comm. of ACM, October, 1965,
pp. 607-615.

8. Parnas, D. L., "A Language for Describing the Function of Synchronous Systems,"
Comm. of ACM, February, 1966, pp. 72-76.

9. Wilber, J. A., "A Language for Describing Digital Computer," Report No. 197,
Department of Computer Science, University of Illinois, February 15, 1966.

10. Giese, A., "Hargol: A Hardware Oriented Algol Language," Internal Report No. VAS,
August, 1966, A/S Regnecentralen, Copenhagen, Denmark.

11. Metze, G. and Seshu, S., "A Proposal for a Computer Compiler," Proc. of the SJCC
Conference, 1966, pp. 253-263.

12. Zucker, M. S., "LOCS: An EDP Machines Logic and Control Simulator," International
Convention Record, Part 3, 1965, pp. 28-50.

13. Clark, W. A., "Macromodular Computer Systems," AFIPS Spring Joint Computer
Conference, 30, pp. 335-336, 1967.

14. Ornstein, S. M., Stucki, M. J., and Clark, W. A., "A Functional Description of
Macromodules," AFIPS Spring Joint Computer Conference, 30, pp. 337-364, 1967.

15. Friedman, T. D., "ALERT: A Program to Compile Logic Designs of New Computers,"
Digest of the First Annual IEEE Computer Conference, September 6, 1967, pp. 128-130.

16. Gorman, D. F., "Systems Level Design Automation: A Progress Report of the System
Descriptive Language (SDL II)," Digest of the First Annual IEEE Computer Conference,
September 6, 1967, pp. 131-134.

17. Darringer, J. A., "A Language for the Description of Digital Computer Processor,"
Proceedings for the Design Automation Workshop, July, 1968.

18. Duley, J. R., and Dietmeyer, D. L., "A Digital System Design Language (DDL),"
IEEE Transactions on Computers, September, 1968, pp. 850-861.

19. Chu, Y., Introduction to Computer Organization. Englewood Cliffs, N.J.: Prentice-Hall,
Inc., 1970.


Problems 


1.1. Given register SR(0-35), declare by a subregister statement a subregister for the sign
bit SR(0), a subregister for the characteristic bits SR(1-8), and a subregister for the
fraction bits SR(9-35).

1.2. Given a decoder described by the statements,

        Register, C(1-3),
        Decoder, N(0-7)=C,

use a terminal statement to describe the decoder instead of the decoder statement.

1.3. Given the four-position switch N(ONE, TWO, THREE, OFF) and three lights
LTONE(ON, OFF), LTTWO(ON, OFF), and LTTHREE(ON, OFF). When switch
N is at the OFF position, none of the lights are at the ON condition. When switch N is
at the ONE position, only light LTONE is at the ON condition. When switch N is at the
TWO position, lights LTONE and LTTWO are at the ON condition. When switch N
is at the THREE position, all three lights are at the ON condition. Describe the above
switch-light logic by micro-statements.

1.4. A single-bit full adder-subtracter is a logic network which performs a single-bit
addition or subtraction if the associated single-bit register AS is 0 or 1, respectively. The
adder-subtracter has three inputs: augend or minuend, addend or subtrahend, and
input carry or input borrow. It has two outputs: sum or difference and output carry
or output borrow. Describe the full adder-subtracter by a terminal statement.

1.5. If single-bit registers A and B are added arithmetically with the resulting sum bit and
carry bit to be stored in single-bit registers C and D respectively, use conditional
micro-statements to describe the addition.

1.6. Give one numerical example each for the two possible cases of overflow when two
binary numbers in the signed 2's complement representation are added.

1.7. Repeat Problem 1.6 when one binary number is subtracted from the other.

1.8. Replace the increment micro-operations in statements (1.29) by constant-setting
micro-operations if the four sequences described in statements (1.29) are advanced
according to the contents of register F following the order of 00, 10, 01, 11, 00, ....

1.9. If light LTSTOP(ON, OFF) is additionally provided in the computer described in
statements (1.30), modify the statements (1.30) (1.31) so that this light is turned to the
ON or OFF condition when register G becomes 0 or 1, respectively.

1.10. If decoder terminal K(10) is selected to command a count sequence which counts the
number of 1's in the accumulator, describe the count sequence and incorporate it in
the sequence chart of Fig. 1.5 and in the statement description (1.31). Additional
elements may be employed if they are needed.

1.11. Simulate the stored-program computer described by statements (1.30) and (1.31) by
an algorithmic language such as FORTRAN, ALGOL, or PL/I.


Chapter 2 Some Organizations


Chapter 1 introduces the Computer Design Language by describing the
organization of a simple, stored-program computer. This chapter presents a number of
simple organizations as additional examples in showing the use of CDL. With these
examples, it should be clear to the reader that both the hardware and the software
implement algorithms, but the constraints are different. The balance between the
hardware and software in a computer system is constantly changing as both
computer technology and applications progress.


2.1 А Serial Parity Generator 


The parity bit for a binary word is an extra bit provided to check a single error 
in the binary word resulting from the possible malfunction of the circuitry. It can be 
odd or even. Serial generation of the parity bit for a binary word is described below. 


2.1.1 Generation of a Parity Bit 


Consider a binary word in register A(1-6) with its parity bit stored in the single-bit
register PB. If the parity bit is so chosen that the number of 1's in casregister
A-PB is odd, the parity bit is called an odd parity bit. Conversely, if the parity bit is
so chosen that the number of 1's is even, the parity bit is called an even parity bit.
For example, if the contents of casregister A-PB are 0010101, the number of 1's in
the casregister is an odd number of three. For an odd parity, register PB correctly
contains one. For an even parity, PB should contain zero so that the number of 1's
in the casregister is an even number of two.

The above definition of the odd parity bit for the binary word in casregister
A-PB gives,

        A(1)⊕A(2)⊕A(3)⊕A(4)⊕A(5)⊕A(6)⊕PB=1                (2.1)

where symbol ⊕ denotes a logical EXCLUSIVE-OR operator. When the following
two Boolean relations

        PB⊕PB=0                                            (2.2)

and

        1⊕1=0                                              (2.3)

are used, equation (2.1) can be written as,

        PB=A(1)⊕A(2)⊕A(3)⊕A(4)⊕A(5)⊕A(6)⊕1                (2.4)

This is the equation to be used when generating the odd parity bit. The definition of
the even parity bit for the binary word in casregister A-PB gives,

        A(1)⊕A(2)⊕A(3)⊕A(4)⊕A(5)⊕A(6)⊕PB=0                (2.5)

which can be written as

        PB=A(1)⊕A(2)⊕A(3)⊕A(4)⊕A(5)⊕A(6)⊕0                (2.6)

This is the equation which can be used to generate the even parity bit.

2.1.2 Configuration 


A configuration for generating a parity bit serially is shown in Fig. 2.1. Register
A is the shift register where the binary word is stored. Register PB stores the parity
bit of the binary word. Counter C counts the number of shifts in register A. Counter
T and clock P generate the control signals. Switch START initializes the necessary
operations for the generating sequence, and light FINI indicates the completion of
the operation.

Fig. 2.1 Configuration for a serial parity generator

These elements are described by the following declaration statements:


Comment, configuration of a serial parity generator.                (2.7)
Register, A(1-6),        $shift register
          PB,            $parity bit register
          C(0-2),        $counter
          T(1-4),        $control register
Switch,   START(ON),
Light,    FINI(ON,OFF),
Clock,    P


2.1.3 Sequence Chart 


The generation of the parity bit is described by the sequence chart in Fig. 2.2.
The binary word is assumed initially in register A. As shown in Fig. 2.2, when switch
START is turned to the ON position, counter C is reset to 0 and light FINI is set to
the OFF condition. To generate an odd parity bit, register PB is initially set to 1.
Then, the logical EXCLUSIVE-OR operation is performed and at the same time
counter C is incremented by 1. Register A is then circularly shifted one bit to the
right. Counter C is then tested for the value of 6. If the value is not 6, the logical
operation, shifting, and counting are repeated until the value reaches 6. By then the
desired parity bit is in register PB and light FINI is turned to the ON condition to
indicate completion.

Fig. 2.2 Sequence chart for a serial parity generator

For generating an even parity bit, PB is initially reset to 0; the other steps remain
the same.


2.1.4 Sequence Description 


The sequential operations are now described by the following execution state- 
ments: 


Comment, here begins generation of parity bit                (2.8)
/START(ON)/  PB←1, FINI←OFF, C←0, T←10₈,
/T(1)*P/     PB←A(6)⊕PB, C←countup C, T(1,2)←01,
/T(2)*P/     A←cir A, T(2,3)←01,
/T(3)*P/     IF (C=6) THEN (T(3,4)←01) ELSE (T(3,1)←01),
/T(4)*P/     FINI←ON, T(4)←0,
End
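
The sequence in (2.8) can be traced in software. The following Python sketch is an illustration only, not a CDL construct: the function name, the use of a Python list for register A, and the loop standing in for control register T are assumptions.

```python
def serial_parity(word_bits, odd=True):
    """Generate the parity bit for A(1)..A(6), given as a list of 0/1 values.

    Returns (PB, final contents of A). Because A is circulated once per
    bit, it holds its original contents when the sequence ends.
    """
    a = list(word_bits)
    pb = 1 if odd else 0          # START: PB <- 1 (odd) or 0 (even)
    c = 0                         # START: C <- 0
    while True:
        pb ^= a[-1]               # T(1): PB <- A(6) xor PB
        c += 1                    # T(1): C <- countup C
        a = [a[-1]] + a[:-1]      # T(2): A <- cir A (circulate right one bit)
        if c == len(a):           # T(3): test C for the word length (6)
            break
    return pb, a                  # T(4): FINI <- ON
```

With the word 001010 from the example in Sec. 2.1.1, the sketch yields PB = 1 for odd parity and PB = 0 for even parity.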


2.2 Serial Comparators 


Two binary numbers can be compared serially to determine whether one is
greater than, smaller than, or equal to the other. Sequences to compare two unsigned
binary numbers and two signed binary numbers are described below.


2.2.1 Serial Comparison of Two Binary 
Numbers 


Unsigned binary numbers are positive numbers. When two positive numbers
are compared serially, the comparison algorithm is determined by the direction in
which these numbers are scanned. When they are scanned and compared bit by bit
from right to left (i.e., from the least significant bit to the most significant bit), this is
equivalent to a serial subtraction. Therefore, the result of the comparison is unknown
until the borrow from the most significant bits becomes available or until every pair
of respective bits is found equal. When two positive binary numbers are scanned and
compared from left to right, the result of the comparison can be established as soon
as two respective bits are found unequal, unless the two numbers are equal; in the
latter case, all the bits of the numbers have to be scanned.

Signed binary numbers can be either positive or negative. When two signed 
binary numbers are compared serially, the comparison algorithm depends not only 
upon the direction in which these numbers are scanned, but also on the manner in which 
these numbers are represented. Only binary numbers in the signed magnitude repre- 
sentation are considered here. If the signs of the two compared signed binary numbers 
are different, the number with the positive sign is the larger number. If the two signs 
are the same and the numbers are scanned from left to right, the number with the 
first larger bit (i.e., 1) is the larger one if both signs are positive. The number with the
first smaller bit (i.e., 0) is the larger one if both signs are negative. When the two 
signed numbers are scanned from right to left, this is again equivalent to a serial 
subtraction and the result of the comparison is not known until the borrow from the 
most significant bits becomes available. In whichever direction the numbers are 
scanned, the two numbers are equal only when their respective bits are equal. 


2.2.2 Configuration for Comparing Unsigned 
Binary Numbers 


The configuration for serial comparison of two unsigned binary numbers is
shown in Fig. 2.3. Registers A and B store the two unsigned binary numbers to be
compared. Registers GT, EQ, and LT indicate the result of comparison. Counter C
counts the number of bit positions that have been examined. Register T, clock P, and
the decoder with output terminals K generate the control signals. Switch START
initializes the necessary operations for the comparison sequence. Light FINI indicates
the completion of the comparison. These elements are described by the following
declaration statements:


Comment, configuration of serial comparator for two unsigned numbers        (2.9)
Register, EQ,        $indicate equal when 1.
          GT,        $indicate A is greater than B when 1.
          LT,        $indicate A is less than B when 1.
          SIGN,      $store sign.
          A(0-7),    $store one number. A(0) is the sign bit.
          B(0-7),    $store the other number. B(0) is the sign bit.
          C(0-2),    $counter
          T(0-3),    $control register
Decoder,  K(0-8)=T
Switch,   START(ON)
Light,    FINI(ON,OFF)
Clock,    P

Fig. 2.3 Configuration for a serial comparator



A single-bit comparator with three outputs is shown in the configuration in Fig.
2.3. These outputs indicate whether the contents of casregister A(0)-B(0) are 01, 00 or
11, or 10, which means respectively that bit A(0) is smaller than, equal to, or greater
than bit B(0). These three cases are represented respectively by the Boolean
expressions A(0)'*B(0), A(0)⊙B(0), and A(0)*B(0)', where symbol ⊙ denotes a logical
COINCIDENCE operator. These Boolean expressions will be used later.


2.2.3 Sequence Chart for Comparing Unsigned 
Binary Numbers 


Serial comparison of two unsigned binary numbers is shown in the sequence 
chart in Fig. 2.4. The two numbers are assumed initially to be in registers A and B. 
When switch START is turned to the ON position, register EQ is set to 1 and registers 
LT and GT are reset to 0. In addition, counter C is reset to 0 and the light FINI is 
turned to the OFF condition. The leftmost bits of registers A and B are examined. 
If bit A(0) is larger than bit В(0), register GT is set to 1 and register EQ is reset to 
0. If bit A(0) is smaller than bit В(0), register LT is set to 1 and register EQ is reset 
to 0. Nothing happens when these two bits are equal. In each of these three cases, 
counter C is incremented by 1 and both registers A and В are circularly shifted one 
bit to the left. Counter C is then tested for 8. If counter C does not contain 8, registers 
GT and LT are tested for 1. If either of them contains a 1 (i.e., Boolean expression
GT + LT = 1), the bit comparison is omitted. If neither contains a 1, casregister
A(0)-B(0) is tested again, and the micro-operations of incrementing counter C and shifting
registers A and B are again performed. Counter C is again tested for 8. If counter 
C does not contain 8, the testing and micro-operation just described are repeated 
until counter C reaches 8. At this time, light FINI is turned to the ON position. 
The comparison is now completed. 

It should be noted that the above algorithm leaves the contents of registers A 
and B unchanged at the end of the comparison. 


2.2.4 Sequence Description for Comparing 
Unsigned Binary Numbers 


The sequential operations described by the sequence chart in Fig. 2.4 are now 
described by the following execution statements: 


Comment, here begins the comparison sequence for unsigned numbers        (2.10)
/START(ON)/  EQ←1, LT←0, GT←0, C←0, FINI←OFF, T←0,
/K(0)*P/     IF (GT+LT=1) THEN (T←2) ELSE (T←1),
/K(1)*P/     IF (A(0)-B(0)=A(0)'*B(0)) THEN (LT←1, EQ←0),
             IF (A(0)-B(0)=A(0)*B(0)') THEN (GT←1, EQ←0),
             T←2,
/K(2)*P/     C←countup C, A←cil A, B←cil B, T←3,
/K(3)*P/     IF (C=8) THEN (T←4) ELSE (T←0),
/K(4)*P/     FINI←ON,
END

Fig. 2.4 Sequence chart for serial comparison of two unsigned binary numbers
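
As a cross-check of statements (2.10), the left-to-right comparison can be sketched in Python. This is a behavioral illustration under stated assumptions: registers A and B are modeled as lists of bits with index 0 as the leftmost bit, and the function name and returned dictionary are hypothetical.

```python
def compare_unsigned(a_bits, b_bits):
    """Compare two unsigned numbers bit-serially, scanning from the left.

    Returns the indicator registers and the final contents of A and B,
    which equal the original contents after a full circulation.
    """
    a, b = list(a_bits), list(b_bits)
    eq, lt, gt, c = 1, 0, 0, 0               # START(ON) initialization
    while True:
        if not (gt or lt):                   # K(0): skip once a result is known
            if a[0] == 0 and b[0] == 1:      # K(1): casregister A(0)-B(0) is 01
                lt, eq = 1, 0
            elif a[0] == 1 and b[0] == 0:    # K(1): casregister A(0)-B(0) is 10
                gt, eq = 1, 0
        c += 1                               # K(2): C <- countup C
        a = a[1:] + a[:1]                    # K(2): A <- cil A
        b = b[1:] + b[:1]                    # K(2): B <- cil B
        if c == len(a_bits):                 # K(3): all bit positions examined
            break
    return {"GT": gt, "EQ": eq, "LT": lt}, a, b
```

The sketch keeps shifting after the first unequal pair is found, as the sequence chart does, so that both registers end with their original contents.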


2.2.5 A Serial Comparator for Signed Binary 
Numbers 


The configuration for serial comparison of two binary numbers in the signed 
magnitude representation is the same as the one shown in Fig. 2.3. It can also be 
described by statements (2.9). It should be noted, however, that register SIGN which 
is provided for use in comparing signed binary numbers is not used in the description 
of (2.10). 

The sequence chart for the serial comparison is shown in Fig. 2.5. The two signed
numbers are assumed initially to be in registers A and B. After similar initialization
by switch START, registers A and B are tested for possible negative zero. If negative
zero occurs, it is reset to positive zero. The sign in bit A(0) is next stored in register
SIGN for later reference. The first two bits to be compared are sign bits. In this case,
01, 00 or 11, and 10 in casregister A(0)-B(0) represent, respectively, that the number
in register A is larger than, possibly equal to, or smaller than the number in register
B. This rule of comparison also applies to the magnitude bits when the two numbers
are both negative. When the two numbers are both positive, the rule for comparing
magnitude bits is as follows: the number in register A is smaller than, possibly equal
to, or larger than the number in register B when casregister A(0)-B(0) contains 01, 00
or 11, and 10, respectively. The opposite nature of these two rules requires the use
of register SIGN and makes the comparison algorithm in Fig. 2.5 more complex than
that in Fig. 2.4. Otherwise, the two comparison algorithms in Figs. 2.4 and 2.5 are
similar.

The sequential operations of the serial comparator for the two signed binary 
numbers are now described by the following execution statements: 


Comment, here begins the comparison sequence for signed numbers        (2.11)
/START(ON)/  EQ←1, LT←0, GT←0, C←0, FINI←OFF, T←0,
/K(0)*P/     IF (A=200₈) THEN (A←0),
             IF (B=200₈) THEN (B←0),
             T←1,
/K(1)*P/     SIGN←A(0), T←4,
/K(2)*P/     IF (GT+LT=1) THEN (T←6) ELSE (T←3),
/K(3)*P/     IF (SIGN=1) THEN (T←4) ELSE (T←5),
/K(4)*P/     IF (A(0)-B(0)=A(0)*B(0)') THEN (LT←1, EQ←0),
             IF (A(0)-B(0)=A(0)'*B(0)) THEN (GT←1, EQ←0),
             T←6,
/K(5)*P/     IF (A(0)-B(0)=A(0)'*B(0)) THEN (LT←1, EQ←0),
             IF (A(0)-B(0)=A(0)*B(0)') THEN (GT←1, EQ←0),
             T←6,
/K(6)*P/     C←countup C, A←cil A, B←cil B, T←7,
/K(7)*P/     IF (C=8) THEN (T←8) ELSE (T←2),
/K(8)*P/     FINI←ON,
END

Fig. 2.5 Sequence chart for serial comparison of two signed binary numbers
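
The role of register SIGN in statements (2.11) is perhaps easiest to see in a behavioral sketch. The Python below is an illustration under assumptions: the numbers are lists of bits in signed magnitude form with index 0 as the sign, and the function name and the flag `negative_rule` are hypothetical stand-ins for the choice between K(4) and K(5).

```python
def compare_signed_magnitude(a_bits, b_bits):
    """Compare two signed-magnitude numbers bit-serially from the left."""
    a, b = list(a_bits), list(b_bits)
    n = len(a)
    negative_zero = [1] + [0] * (n - 1)      # 200 octal for an 8-bit register
    if a == negative_zero:                   # K(0): reset negative zero
        a = [0] * n
    if b == negative_zero:
        b = [0] * n
    sign = a[0]                              # K(1): SIGN <- A(0)
    eq, lt, gt = 1, 0, 0
    negative_rule = True                     # K(1) goes to K(4) for the sign bits
    for _ in range(n):                       # K(7): repeat until C reaches n
        if not (gt or lt):                   # K(2): skip once a result is known
            pair = (a[0], b[0])
            if negative_rule:                # K(4): 10 means A smaller, 01 larger
                if pair == (1, 0):
                    lt, eq = 1, 0
                elif pair == (0, 1):
                    gt, eq = 1, 0
            else:                            # K(5): 01 means A smaller, 10 larger
                if pair == (0, 1):
                    lt, eq = 1, 0
                elif pair == (1, 0):
                    gt, eq = 1, 0
        a = a[1:] + a[:1]                    # K(6): A <- cil A
        b = b[1:] + b[:1]                    # K(6): B <- cil B
        negative_rule = (sign == 1)          # K(3): rule for the magnitude bits
    return {"GT": gt, "EQ": eq, "LT": lt}
```

The sign bits are always judged by the rule used for negative magnitudes, since a 1 in the sign position marks the smaller number.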


2.3 Finding the Largest Number 


The purpose of this organization is to find the largest element among n
unsigned binary numbers. Two versions are shown: one by parallel subtraction and
the other by serial subtraction.


2.3.1 An Algorithm of Finding the Largest 
Number 


An algorithm to find the largest element among given binary numbers X(1),
..., X(n) is shown in Fig. 2.6, where n is the number of elements, m the current
largest element, k the pointer which points to the element now being compared, and
j the pointer which points to the current largest element. As shown in Fig. 2.6, the
first comparison is between elements X(n) and X(n - 1), of which the larger is stored
in m. Next is the comparison between m and X(n - 2), followed by the comparison
between m and X(n - 3), and so forth, during which m always stores the larger
element after each comparison. Pointer k begins from (n - 1) and is decremented after
each comparison. The searching process terminates when k reaches 0.
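
The search just described can be written as a few lines of Python. This is a sketch of the flowchart's logic only; the function name and the convention that the Python list holds X(1) at index 0 are assumptions, and at least one element is assumed to be present.

```python
def find_largest(x):
    """Return (m, j): the largest element and its 1-based position.

    x is a non-empty list in which x[0] stands for X(1) and x[-1] for X(n).
    """
    n = len(x)
    j = n                  # j <- n
    k = n - 1              # k <- (n - 1)
    m = x[n - 1]           # m <- X(n)
    while k != 0:          # terminate when k reaches 0
        if x[k - 1] > m:   # compare X(k) with the current largest element
            m = x[k - 1]
            j = k
        k -= 1             # decrement the pointer after each comparison
    return m, j
```

Because m is replaced only when a strictly larger element is found, j ends up pointing at the rightmost occurrence of the largest value.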


2.3.2 Version A 


Version A describes an implementation using a parallel subtracter. Let the given
binary numbers be unsigned integers stored in memory X with address register C
and buffer register R. Let the capacity of memory X be 1024 words with the word
length of 24 bits. Assume that number n is stored in the first location and the integers
in the succeeding locations of the memory. Registers J and K store pointers j and k,
respectively. Register A stores the current largest integer after each comparison.
In addition, there are control register T, switch START, and light FINI as shown in
the block diagram in Fig. 2.7.

Fig. 2.6 Flowchart to find the largest among n elements

The process of finding the largest element is illustrated in the sequence chart in
Fig. 2.8. After initialization during which register C is reset to 0 and the light FINI
is turned to the OFF condition, the first word, number n, is read out of memory X.
Number n is then transferred to memory address register C to read out the last
element of the given n elements. This last element is stored in register A. The
second-to-last element is next read out of memory X and stored in buffer register R.
The numbers in registers A and R are compared by the parallel subtracter. Terminal
BOR(0) is the borrow bit from the leftmost stage of the parallel subtracter. When
terminal BOR(0) is 1, this indicates that the unsigned binary integer in register A is
smaller than that in register R. In this case, the larger number in register R is
transferred to register A and the memory address where this larger number is stored in
memory X is transferred to register J. The next element is then taken out of the memory
and compared with the number in register A. Again, the larger number and its memory
address are stored in registers A and J, respectively. This process continues until K
reaches 0. By that time, all the elements are compared. The largest element is in
register A and its memory address is in register J.

Fig. 2.7 Configuration for finding the largest element with a parallel subtracter

The above configuration for locating the largest integer is now described by the 
following statements: 


Comment, configuration with a parallel subtracter                        (2.12)
Register,     C(0-9),        $address register
              R(1-24),       $buffer register
              J(0-9),        $store pointer j
              K(0-9),        $store pointer k
              A(1-24),       $store current largest element
              T(0-3),        $control register
Subregister,  R(ADDR)=R(15-24),
Memory,       X(C)=X(0-1023,1-24),
Decoder,      M(0-10)=T
Switch,       START(ON),
Light,        FINI(ON,OFF),
Terminal,     DIFF(1-24)=A(1-24)⊕R(1-24)⊕BOR(1-24),
              BOR(0-23)=A(1-24)'*R(1-24)+R(1-24)*BOR(1-24)
                        +BOR(1-24)*A(1-24)',
              BOR(24)=0,
Comment, here begins the comparison sequence
/START(ON)/   C←0, FINI←OFF, T←0,
/M(0)*P/      R←X(C), T←1,                          $read out n
/M(1)*P/      K←R(ADDR), T←2,                       $store n in K
/M(2)*P/      C←K, J←K, T←3,                        $store n in J and C
/M(3)*P/      R←X(C), T←4,                          $read out X(n)
/M(4)*P/      A←R,                                  $store X(n) in A
              K←countdn K,                          $obtain (n-1)
              T←5,
/M(5)*P/      IF (K=0) THEN (T←10) ELSE (T←6),
/M(6)*P/      C←K, T←7,
/M(7)*P/      R←X(C), T←8,                          $read next element X(C)
/M(8)*P/      IF (BOR(0)=1) THEN (A←R, J←K), T←9,
/M(9)*P/      K←countdn K, T←5,
/M(10)*P/     FINI←ON,
END

Fig. 2.8 Sequence chart for finding the largest element (version A)


The terminal statement in (2.12) describes the parallel subtracter. Terminals
DIFF which are the difference outputs of the subtracter are not needed; they are
merely shown for clarity.
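
To see how the borrow terminal drives the search, the sequence of (2.12) can be mimicked at the register level in Python. This is a sketch under assumptions: memory X is a Python list whose element 0 holds n (with n at least 1), the stage-by-stage borrow follows the usual subtracter equation for A minus R, and the function names are hypothetical.

```python
def borrow_out(a, r, width=24):
    """BOR(0): borrow from the leftmost stage when R is subtracted from A."""
    bor = 0                                   # BOR(24)=0, no borrow into the rightmost stage
    for i in range(width):                    # ripple from the rightmost stage to the left
        ai = (a >> i) & 1
        ri = (r >> i) & 1
        bor = ((1 - ai) & ri) | (ri & bor) | (bor & (1 - ai))
    return bor


def find_largest_version_a(memory):
    """Trace steps M(0) to M(10); return (A, J) = (largest element, its address)."""
    c = 0                                     # START: C <- 0
    r = memory[c]                             # M(0): read out n
    k = r & 0x3FF                             # M(1): K <- R(ADDR), the rightmost 10 bits
    c = j = k                                 # M(2): store n in J and C
    r = memory[c]                             # M(3): read out X(n)
    a = r                                     # M(4): store X(n) in A
    k -= 1                                    # M(4): obtain (n-1)
    while k != 0:                             # M(5): test K for 0
        c = k                                 # M(6)
        r = memory[c]                         # M(7): read next element X(C)
        if borrow_out(a, r) == 1:             # M(8): A is smaller than R
            a, j = r, k
        k -= 1                                # M(9)
    return a, j                               # M(10): FINI <- ON
```

Only the borrow is used to steer the search, which is why the difference terminals need not be built.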




2.3.3 Version В 


Version B describes an implementation using a serial subtracter. This version
has the same elements as those described in statements (2.12) except for the
following: (a) counter BC is added for counting the steps during serial subtraction, and
(b) a single-bit subtracter together with borrow register B replaces the parallel
subtracter. This configuration is shown in the block diagram in Fig. 2.9.

The process of finding the largest integer is shown in the sequence chart in Fig.
2.10. The sequence charts in Figs. 2.8 and 2.10 are almost identical, with the exception
that there are two loops in Fig. 2.10 and only one in Fig. 2.8. The inner loop in Fig.
2.10 performs the serial subtraction of two numbers.

The configuration and sequential operation of version B are described by the
following statements:


Comment, configuration with a serial subtracter                          (2.13)
Register,     C(0-9),        $address register
              R(1-24),       $buffer register
              J(0-9),        $store pointer j
              K(0-9),        $store pointer k
              A(1-24),       $store current largest element
              B,             $borrow register
              BC(0-4),       $bit counter
              T(0-3),        $control register
Subregister,  R(ADDR)=R(15-24),
Memory,       X(C)=X(0-1023,1-24),
Decoder,      M(0-12)=T,
Switch,       START(ON),
Light,        FINI(ON,OFF),
Terminal,     DIFF=A(24)⊕R(24)⊕B,
              BORROW=A(24)'*R(24)+R(24)*B+B*A(24)',
Comment, here begins the sequence
/START(ON)/   C←0, FINI←OFF, B←0, BC←0, T←0,
/M(0)*P/      R←X(C), T←1,                          $read out n
/M(1)*P/      K←R(ADDR), T←2,                       $store n in K
/M(2)*P/      J←K, C←K, T←3,                        $store n in J and in C
/M(3)*P/      R←X(C), T←4,                          $read out X(n)
/M(4)*P/      A←R,                                  $store X(n) in A
              K←countdn K,                          $obtain (n-1)
              T←5,
/M(5)*P/      IF (K=0) THEN (T←12) ELSE (T←6),
/M(6)*P/      C←K, T←7,
/M(7)*P/      R←X(C), T←8,                          $read next element X(C)
/M(8)*P/      R←cir R, B←BORROW, A←cir A,
              BC←countup BC, T←9,
/M(9)*P/      IF (BC≠24) THEN (T←8) ELSE (T←10),
/M(10)*P/     IF (B=1) THEN (A←R, J←K), T←11,
/M(11)*P/     K←countdn K, BC←0, T←5,
/M(12)*P/     FINI←ON,
END

Fig. 2.9 Configuration for finding the largest element with a serial subtracter

Fig. 2.10 Sequence chart for finding the largest element (version B)
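
The inner loop of version B, steps M(8) and M(9), can be sketched in Python as follows. The sketch assumes that the single-bit subtracter forms the borrow of A minus R, so that a final borrow of 1 means that A is smaller than R; the function name and the use of integers for the 24-bit registers are likewise assumptions for illustration.

```python
def serial_borrow(a, r, width=24, b=0):
    """Run the bit-serial subtraction loop; return (B, A, R) afterwards.

    a and r are the register contents as integers, and b is the initial
    content of borrow register B. Both registers are circulated right
    once per step, so after `width` steps they hold their original values.
    """
    mask = (1 << width) - 1
    for _ in range(width):                                  # M(9): repeat until BC = width
        a24 = a & 1                                         # rightmost bit A(24)
        r24 = r & 1                                         # rightmost bit R(24)
        b = ((1 - a24) & r24) | (r24 & b) | (b & (1 - a24)) # B <- BORROW
        a = ((a >> 1) | (a24 << (width - 1))) & mask        # A <- cir A
        r = ((r >> 1) | (r24 << (width - 1))) & mask        # R <- cir R
    return b, a, r
```

Compared with the parallel subtracter of version A, one single-bit stage is reused for 24 clock periods, trading hardware for time.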


2.4 A Prime Number Generator 


A prime number is an integer greater than 1 that is divisible by no positive integer
except 1 and itself. For example, 2, 3, 5, and 7 are prime numbers. An algorithm for
generating prime numbers will first be described in a flow chart for programming
purposes. It will then be described in a sequence chart for hardware implementation,
and thus serve as an example of how a "software description" can be translated into
a "hardware description."


2.4.1 Generation of Prime Numbers 


Let N be a possible prime number and PRIME be the table where the generated
prime numbers are stored. Associated with table PRIME are two indices, J and K.
J points to the last entry of table PRIME where a prime number is entered. K is a
pointer for scanning table PRIME. Thus, PRIME(K) is the Kth entry of table PRIME.

The first two prime numbers, 2 and 3, are initially placed in the first two entries 
of table PRIME. The succeeding prime numbers are generated according to the 
following rules: 


1. Possible prime numbers №5 must be odd integers. 


2. Quotient Q and remainder R obtained from the following division of N by
PRIME(K),

        N / PRIME(K) = Q + R                                    (2.14)

are used to determine whether N is a prime number as follows: For a given N, K is
chosen to be 2, so that PRIME(K) is 3. Then the division in (2.14) is performed. If R
is zero, N is not a prime number. If R is not zero, and if Q is less than or equal to
PRIME(K),

        Q ≤ PRIME(K)                                            (2.15)

N is a prime number. Otherwise, the next entry PRIME(K + 1) is chosen from table
PRIME and the above division is again performed until either R is zero or Q is less
than or equal to PRIME(K).
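
The two rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not the flowchart of the text: the function name first_primes and the zero-based list prime are assumptions made for the example, so prime[0] plays the role of PRIME(1).

```python
def first_primes(count):
    """Return the first `count` primes using rules 1 and 2 above."""
    prime = [2, 3]            # table PRIME; prime[0] plays the role of PRIME(1)
    n = 3                     # candidate N
    while len(prime) < count:
        n += 2                # rule 1: only odd candidates are tried
        k = 1                 # start with PRIME(2) = 3
        while True:
            q, r = divmod(n, prime[k])   # division (2.14)
            if r == 0:                   # remainder zero: N is not prime
                break
            if q <= prime[k]:            # inequality test (2.15): N is prime
                prime.append(n)
                break
            k += 1                       # otherwise try PRIME(K + 1)
    return prime[:count]


print(first_primes(8))
```

The inner loop never runs past the end of the table, because the table always holds a prime at least as large as the square root of the candidate by the time that candidate is tested.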


2.4.2 "Software" Description 


A flowchart showing the algorithm which generates the first 1000 prime numbers
by using the rules stated above is shown in Fig. 2.11. For example, generation of the
first six prime numbers by this algorithm is shown in Table 2.1, where test signifies
the division in (2.14). The first two prime numbers 2 and 3 are known. The first N
is chosen to be 5 and each succeeding N is obtained by incrementing N by 2 as a
result of rule 1. As shown in Table 2.1, 5, 7, 11, 13, 17, and 19 are found to be
prime numbers.

Fig. 2.11 Flowchart for generating the first 1000 prime numbers


2.4.3 "Hardware" Description 


Let all the numbers be binary. Let memory PRIME store the generated prime 
numbers. For 1000 prime numbers, the memory is chosen to have a capacity of 1024 
words. Since the 1000th prime number is 7919, the memory requires a word length 
of 13 bits. Let registers C and B be the address register and buffer register of the 
memory, respectively. Registers J and K store indices J and K, respectively. Counter 
N generates and stores a new integer as a possible prime number. Register A serves 
as an accumulator. A parallel subtracter which subtracts the contents of register B 
from those in register A and produces difference DIFF and borrow BOR is provided. 
Register UNDER is used to indicate possible borrow from subtraction. In addition, 
switch START and light FINI are employed to initialize the generation and to indi- 
cate the completion, respectively. Register T, its associated decoder, and clock P 
generate control signals. This configuration is shown in the block diagram in Fig. 
2.12, and can be described by the following declaration statements: 


TABLE 2.1 Generation of the First 6 Prime Numbers

              Test
  N    PRIME(K)    Q    R    Result
  5       3        1    2    N is prime as Q<3.
  7       3        2    1    N is prime as Q<3.
  9       3        3    0    N is not prime as R=0.
 11       3        3    2    N is prime as Q=3.
 13       3        4    1    Continue to test as Q>3.
 13       5        2    3    N is prime as Q<5.
 15       3        5    0    N is not prime as R=0.
 17       3        5    2    Continue to test as Q>3.
 17       5        3    2    N is prime as Q<5.
 19       3        6    1    Continue to test as Q>3.
 19       5        3    4    N is prime as Q<5.

Comment, configuration for generating prime numbers (2.16)
Memory,   PRIME(C)=PRIME(0-1023,0-12),
Register, C(0-9), $address register
          B(0-12), $buffer register
          J(0-9), $index j
          K(0-9), $index k
          N(0-12), $store possible prime number
          A(0-12), $accumulator
          Q(0-12), $counter
          UNDER, $indicate borrow during subtraction
          T(0-3), $control register
Terminal, DIFF(0-12)=A(0-12)⊕B(0-12)⊕BOR(0-12),
          BOR(0-11)=A(1-12)'*B(1-12)+B(1-12)*BOR(1-12)+BOR(1-12)*A(1-12)',
          BOR(12)=0,
Decoder,  M(0-11)=T,
Light,    FINI(ON,OFF),
Switch,   START(ON),
Clock,    P,

Fig. 2.12 A configuration for generating the first 1000 prime numbers


The above terminal statement describes the previously mentioned parallel subtracter. 

After examining the flowchart in Fig. 2.11, the most complex operations are
found to be those for division (2.14) and inequality test (2.15). The division can be
replaced by repeated subtraction and counting. After each subtraction, register A is
tested for 0. If register A is 0, it means that remainder R is 0 and the integer in register
N is not a prime number. If register A is not 0, the quotient is obtained by counting
the number of successful subtractions in register Q. A successful subtraction is one
where the difference is not negative; this requires that the contents of register A are
not zero and that the borrow does not occur. The quotient in register Q is next
transferred to register A so that the inequality test (2.15) can be performed by the
parallel subtracter. If borrow occurs (UNDER = 1) or the contents of
register A are 0 (A = 0) after the subtraction, then the number in register N is a
prime number. Otherwise, the next larger prime number is read out of the memory
for use as the divisor in division (2.14). This process continues until either the
remainder is 0 or the inequality test is fulfilled. The former means that the integer in
register N is not a prime number, while the latter means that it is a prime number. This
algorithm for generating the prime numbers by hardware is shown in the sequence
chart of Fig. 2.13.

Fig. 2.13(a) Sequence chart for generating the first 1000 prime numbers

Fig. 2.13(b) Sequence chart for generating the first 1000 prime numbers

The generating sequence shown in the sequence chart of Fig. 2.13 is now described 
by the following execution statements: 


Comment, here begins the generation sequence (2.17)
/START(ON)/ C←1, J←1, N←3, B←2, FINI←OFF, T←0,
/M(0)*P/    PRIME(C)←B, T←1,
/M(1)*P/    J←countup J, T←2,
/M(2)*P/    C←J, B←N, T←3,
/M(3)*P/    PRIME(C)←B, T←4,
/M(4)*P/    IF (J=1000) THEN (FINI←ON) ELSE (T←5),
/M(5)*P/    N←2 countup N, K←2, T←6,
/M(6)*P/    A←N, C←K, T←7,
/M(7)*P/    B←PRIME(C), Q←0, T←8,
/M(8)*P/    A←DIFF, UNDER←BOR(0), T←9,
/M(9)*P/    IF (A=0) THEN (T←5) ELSE (IF (UNDER≠1)
            THEN (Q←countup Q, T←8) ELSE (A←Q, T←10)),
/M(10)*P/   A←DIFF, UNDER←BOR(0), T←11,
/M(11)*P/   IF ((UNDER=0)*(A≠0)) THEN (K←countup K, T←6)
            ELSE (T←1),
END
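
To check that the twelve control states cooperate as intended, the sequence can be stepped through in software. The following Python sketch is an assumption-laden model, not part of the design: the function name hardware_primes, the dictionary standing in for memory PRIME, and the modelling of the 13-bit subtracter by arithmetic modulo 8192 are all choices made for illustration. The borrow stored in UNDER is modelled as "A is less than B".

```python
def hardware_primes(limit):
    """Step through states M(0)..M(11); `limit` (>= 2) plays the role of 1000."""
    prime = {}                       # memory PRIME, addressed by register C
    c, j, n, b = 1, 1, 3, 2          # START(ON): C<-1, J<-1, N<-3, B<-2
    a = q = k = under = 0
    t = 0
    while True:
        if t == 0:
            prime[c] = b; t = 1
        elif t == 1:
            j += 1; t = 2
        elif t == 2:
            c, b = j, n; t = 3
        elif t == 3:
            prime[c] = b; t = 4
        elif t == 4:
            if j == limit:
                break                # FINI<-ON
            t = 5
        elif t == 5:
            n += 2; k = 2; t = 6     # next odd candidate
        elif t == 6:
            a, c = n, k; t = 7
        elif t == 7:
            b, q = prime[c], 0; t = 8
        elif t in (8, 10):           # A<-DIFF, UNDER<-BOR(0)
            under = 1 if a < b else 0
            a = (a - b) % 8192       # 13-bit difference
            t += 1
        elif t == 9:
            if a == 0:
                t = 5                # remainder zero: not prime
            elif under != 1:
                q += 1; t = 8        # successful subtraction
            else:
                a = q; t = 10        # quotient ready, start inequality test
        elif t == 11:
            if under == 0 and a != 0:
                k += 1; t = 6        # Q > PRIME(K): try the next divisor
            else:
                t = 1                # N is prime: store it
    return [prime[i] for i in range(1, limit + 1)]


print(hardware_primes(8))
```

Because division is done by repeated subtraction, the number of clock periods grows quickly with N; the model makes that cost easy to measure by counting loop iterations.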


2.5 A Gray-to-binary Code Converter 


If a digital signal is represented by a binary number, there will be one or more 
bit-changes (a bit change is the change of the value of a bit from 0 to 1 or from 1 to 
0) when the number is incremented by one. Bit changes cause ambiguity in sensing 
coded patterns from certain types of analog-to-digital converters. This difficulty can 
be avoided if a Gray coded binary number, developed by F. Gray in 1953, is used. 


A converter which serially converts a 4-bit Gray code into a binary code is described 
below. 


2.5.1 Gray Code to Binary Code Conversion 


The unique characteristic of Gray coded binary numbers, or simply Gray code 
(5), is that there is one and only one bit-change between any two neighboring num- 
bers. An example of 16 numbers of a 4-bit Gray code is shown in Table 2.2, where 
there is only one bit-change between any two neighboring numbers. 


TABLE 2.2 A 4-bit Gray Code

   BINARY CODE          GRAY CODE
  B4  B3  B2  B1      G4  G3  G2  G1
   0   0   0   0       0   0   0   0
   0   0   0   1       0   0   0   1
   0   0   1   0       0   0   1   1
   0   0   1   1       0   0   1   0
   0   1   0   0       0   1   1   0
   0   1   0   1       0   1   1   1
   0   1   1   0       0   1   0   1
   0   1   1   1       0   1   0   0
   1   0   0   0       1   1   0   0
   1   0   0   1       1   1   0   1
   1   0   1   0       1   1   1   1
   1   0   1   1       1   1   1   0
   1   1   0   0       1   0   1   0
   1   1   0   1       1   0   1   1
   1   1   1   0       1   0   0   1
   1   1   1   1       1   0   0   0


Table 2.3 is the truth table which shows the relation between the 4-bit Gray
code and the 4-bit binary code. In this table B4, B3, B2, and B1, the four outputs of
the converter, represent the four bits of the binary code with weights 2^3, 2^2, 2^1, and
2^0, respectively. G4, G3, G2, and G1, which represent the four bits of the Gray code,
are the four inputs of the converter. Boolean relations between the outputs and the
inputs of the serial converter can be readily obtained from the truth table. They are,

        B4 = P8 + P9 + P10 + P11 + P12 + P13 + P14 + P15
        B3 = P4 + P5 + P6 + P7 + P8 + P9 + P10 + P11
        B2 = P2 + P3 + P4 + P5 + P8 + P9 + P14 + P15
        B1 = P1 + P2 + P4 + P7 + P8 + P11 + P13 + P14           (2.18)

where

        P1  = G4'*G3'*G2'*G1,
        P2  = G4'*G3'*G2*G1',
        . . .
        P14 = G4*G3*G2*G1',
        P15 = G4*G3*G2*G1,                                      (2.19)

TABLE 2.3 Truth Table for Gray-to-binary Code Conversion

  G4  G3  G2  G1      B4  B3  B2  B1
   0   0   0   0       0   0   0   0
   0   0   0   1       0   0   0   1
   0   0   1   0       0   0   1   1
   0   0   1   1       0   0   1   0
   0   1   0   0       0   1   1   1
   0   1   0   1       0   1   1   0
   0   1   1   0       0   1   0   0
   0   1   1   1       0   1   0   1
   1   0   0   0       1   1   1   1
   1   0   0   1       1   1   1   0
   1   0   1   0       1   1   0   0
   1   0   1   1       1   1   0   1
   1   1   0   0       1   0   0   0
   1   1   0   1       1   0   0   1
   1   1   1   0       1   0   1   1
   1   1   1   1       1   0   1   0

The above relations can be simplified into

        B4 = G4,
        B3 = G4⊕G3,
        B2 = G4⊕G3⊕G2,
        B1 = G4⊕G3⊕G2⊕G1,                                       (2.20)

The above relations can also be written

        B4 = 0⊕G4,
        B3 = B4⊕G3,
        B2 = B3⊕G2,
        B1 = B2⊕G1,                                             (2.21)

The iterative nature of the above Boolean relations (2.21) accounts for the fact that
the 4-bit Gray code can be serially converted into the 4-bit binary code by means of
a simple logical EXCLUSIVE-OR block. The following macro design of a serial
converter makes use of these Boolean relations.
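
The iterative relations (2.21) translate directly into a loop in which each binary bit is the EXCLUSIVE-OR of the previous binary bit and the current Gray bit. The Python sketch below is illustrative only; the function name gray_to_binary and the convention of passing the bits as a list with G4 first are assumptions of the example.

```python
def gray_to_binary(gray_bits):
    """Convert [G4, G3, G2, G1] to [B4, B3, B2, B1] using relations (2.21)."""
    previous = 0                  # the constant 0 in B4 = 0 xor G4
    binary_bits = []
    for g in gray_bits:           # most significant bit first
        previous ^= g             # B(i) = B(i+1) xor G(i)
        binary_bits.append(previous)
    return binary_bits


print(gray_to_binary([1, 1, 1, 0]))   # Gray 1110
```

Since only the previous output bit is carried from one step to the next, a single EXCLUSIVE-OR block and one bit of storage suffice, which is what the serial converter exploits.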


2.5.2 Configuration 

Fig. 2.14 Configuration for a serial Gray-to-binary code converter

A configuration for converting the 4-bit Gray code in Table 2.3 into the binary
code is shown in the block diagram of Fig. 2.14. Terminals IN(4-1) are the input
terminals for the 4-bit Gray code. Register A is a shift register which accepts the
Gray code and stores the binary code after conversion. Single-bit register D serves
as a temporary storage. Counter C counts the number of shifts performed. Register T
and clock P generate control signals. Switch START initiates the conversion sequence
and light FINI indicates the completion of conversion. This configuration is now
described by the following declaration statements:


Comment, configuration for a 4-bit Gray-to-binary code converter (2.22) 
Register, A(4-1), $shift register 

C(0-2), $counter 

T(1-5), $control register 

D, $operation register 
Terminal, IN(4-1), $Gray code input lines 


Switch, START(ON), 
Light, FINI(ON,OFF), 
Clock, P, 


2.5.3 Sequence chart 


The conversion sequence is described in the sequence chart in Fig. 2.15. As 
shown in the figure, when switch START is turned to the ON position, light FINI is 
turned to the OFF condition, and counter C and register D are both reset to 0. The 
4-bit input of Gray code is then transferred into register A. The logical EXCLUSIVE-OR
of bits A(4) and D (i.e., G4⊕0) is next stored in bit A(4), and counter C is incremented.
Casregister D-A is circularly shifted one bit position to the left. Counter C
is then tested for 4. If C is not 4, the logical EXCLUSIVE-OR, the shifting, the 
counting, and the testing are repeated until counter C reaches 4. At this time, cas- 
register D-A is circularly shifted one bit to the left once more so that the binary 
code which has just been obtained is stored in register A. Light FINI is turned to the 
ON condition and the conversion is now completed. 


2.5.4 Sequence Description 


The conversion operations shown in the sequence chart are now described by 
the following execution statements: 


Comment, here begins the conversion sequence (2.23)
/START(ON)/ C←0, D←0, FINI←OFF, T←20,
/T(1)*P/    A←IN, T←cir T,
/T(2)*P/    A(4)←A(4)⊕D, T←cir T,
/T(3)*P/    D-A←cil D-A, C←countup C, T←cir T,
/T(4)*P/    IF (C=4) THEN (T←cir T) ELSE (T(2,4)←10),
/T(5)*P/    D-A←cil D-A, FINI←ON, T←0,
END

Fig. 2.15 Sequence chart for a serial Gray-to-binary code converter
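
The register transfers of the conversion sequence can be traced with a small software model. In the Python sketch below, which is an illustration under stated assumptions rather than part of the design, register A is a list ordered A(4), A(3), A(2), A(1), register D is an integer, and the name serial_gray_to_binary is hypothetical.

```python
def serial_gray_to_binary(gray_bits):
    """Model the D-A casregister; gray_bits is [G4, G3, G2, G1]."""
    a = list(gray_bits)           # T(1): A <- IN
    d = 0                         # START(ON): D <- 0
    count = 0                     # START(ON): C <- 0
    while True:
        a[0] ^= d                 # T(2): A(4) <- A(4) xor D
        d, a = a[0], a[1:] + [d]  # T(3): circular left shift of D-A
        count += 1                #       and C <- countup C
        if count == len(a):       # T(4): leave the loop when C = 4
            break
    d, a = a[0], a[1:] + [d]      # T(5): one more circular left shift
    return a                      # A now holds B4, B3, B2, B1


print(serial_gray_to_binary([1, 0, 1, 1]))   # Gray 1011
```

After the four passes through the loop the binary bits sit one position too far left, with B1 in register D; the extra shift at T(5) brings them into register A in the proper order.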


2.6 A Binary-to-decimal Converter 


Conversion between binary numbers and decimal numbers is common. This 
organization describes a converter which converts a 10-bit binary integer into a 
3-digit decimal number. 


2.6.1 Binary-to-decimal Number Conversion 


The algorithm for converting a binary number to a decimal number is well known
(7). To be brief, for an n-bit binary integer, the conversion begins at the most
significant bit. This bit is either doubled (i.e., multiplied by 2) or dabbled (i.e.,
multiplied by 2 and then increased by 1), depending on whether the next bit at the
right is 0 or 1, respectively. The result is again doubled or dabbled, depending again
on whether the next bit is 0 or 1. The process of doubling or dabbling continues
(n - 1) times for n bits of the given binary number.

For example, let the binary number be 1010110101. The conversion which begins 
with the most significant bit 1 is shown below: 


Step

  1        0 x 2 + 1 =   1
  2        1 x 2 + 0 =   2
  3        2 x 2 + 1 =   5
  4        5 x 2 + 0 =  10
  5       10 x 2 + 1 =  21
  6       21 x 2 + 1 =  43
  7       43 x 2 + 0 =  86
  8       86 x 2 + 1 = 173
  9      173 x 2 + 0 = 346
 10      346 x 2 + 1 = 693


As shown above, the result is decimal number 693. 
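
The doubling and dabbling steps listed above can be reproduced by a short Python sketch. The function name double_and_dabble and the use of a character string for the binary number are assumptions made for illustration only.

```python
def double_and_dabble(binary_string):
    """Return the running value after each step, most significant bit first."""
    value = 0
    steps = []
    for bit in binary_string:
        value = value * 2 + int(bit)   # double if the bit is 0, dabble if it is 1
        steps.append(value)
    return steps


print(double_and_dabble("1010110101"))
```

The last element of the returned list is the decimal value of the binary number; the earlier elements correspond to steps 1 through 9 of the example.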


2.6.2 Configuration 


The converter described here converts a 10-bit binary integer into a decimal
number which is a binary coded decimal number (BCD) (i.e., each decimal digit is
binary-coded). The conversion takes place in the three 4-bit registers A, B, and C.
At the completion of conversion, registers A, B, and C hold the decimal digits of the
decimal number with weights 10^0, 10^1, and 10^2, respectively. Register Q stores the
10-bit binary number to be converted. Counter D counts the number of shifts still to
be performed, switch START initializes the conversion, and light FINI indicates the
completion of the conversion. Register T, its associated decoder, and clock P generate
the control signals.

The configuration just described is shown in the block diagram of Fig. 2.16,
and is described by the following declaration statements:


Comment, configuration of a binary-to-decimal converter (2.24)
Register, A(4-1), $store 10^0 digit
          B(4-1), $store 10^1 digit
          C(4-1), $store 10^2 digit
          D(0-3), $counter
          Q(1-10), $store the 10-bit input
          T(0-2), $control register
Decoder,  K(0-4)=T,
Terminal, SHIFT=K(0), $shift command
          COUNT=K(1), $count command
          TEST=K(2), $test command
          CORRECT=K(3), $correct command
          LIGHT=K(4), $light command
Light,    FINI(ON,OFF),
Switch,   START(ON),
Clock,    P,

Fig. 2.16 Configuration for a binary-to-decimal converter


The following conversion process, however, requires a correction micro-operation.
For this purpose, a special operator called cor is defined below.

Operator, X←cor X(4-1), (2.25)
/begin/   IF (X=5+6+7+8+9) THEN (X←X add 3),
End of operator.

The above operator adds constant 3 to the contents of register X when it contains
the value 5, 6, 7, 8, or 9.


2.6.3 Conversion Process 


The conversion (4) takes place in casregister C-B-A. It is described by the
example of converting the binary number 1010110101 into the decimal number 693
(i.e., 0110, 1001, 0011). Figure 2.17 shows every step of the conversion. As the first
step, casregister C-B-A is reset to 0. The second step shows the casregister shifted
one bit to the left; thus the most significant bit is now in register A. The third step
performs a correction of adding 3 to register A, B, or C if the contents of that
register exceed 4 (i.e., 0100). Since registers C, B, and A contain 0000, 0000, and
0001, respectively, no correction is needed in this step. The fourth and fifth steps
merely repeat the second and third steps. Again, no correction is needed in the
fifth step.

At the end of the sixth step, registers C, B, and A contain, respectively, 0000,
0000, and 0101. Since the contents of register A exceed 4, a correction of adding
0011 to register A is required; this correction is carried out during the seventh step.
During the eighth step, the casregister is again shifted, leaving registers C, B, and A
containing 0000, 0001, and 0000, respectively (i.e., BCD number 010). Had there been
no add-3 operation during the seventh step, these registers would now contain 0000,
0000, 1010, which is incorrect.

This correction requires further explanation. When the binary number in register
A, B, or C reaches 10 or more, constant 6 should be added to the contents of the
register so that the carry is correctly generated, since decimal numbers are represented
by BCD numbers. This correction, however, can be applied before the left shift; in this
case, the correction is to add 3 instead of 6, because the left shift doubles the
contents of the register.

At the completion of the 14th step, registers C, B, and A contain, respectively,
0000, 1000, and 0110. Hence, the contents of registers B and A are 8 and 6. As both
of them exceed 4, 3 is added to each in the 15th step. The leftshift operation is again
performed in the 16th step, leaving registers C, B, and A containing 0001, 0111, and
0011, which represent decimal number 173.

The above described leftshift and correction (or no correction) micro-operations
are repeated until all the bits in register Q enter register A. At this time, the casregister
contains 0110, 1001, 0011, which represents the correct answer 693.
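
The shift and add-3 process can be checked with a software model of casregister C-B-A-Q. The Python sketch below is only an illustration: the function name binary_to_bcd, the list regs holding A, B, and C in that order, and the parameters bits and digits are assumptions of the example. Numbers above 999 do not fit in three decimal digits and are outside its scope.

```python
def binary_to_bcd(value, bits=10, digits=3):
    """Shift `value` into 4-bit digit registers, adding 3 before each later shift."""
    regs = [0] * digits                  # regs[0] is A (units), regs[-1] is C
    q = value                            # register Q
    for step in range(bits):
        carry = (q >> (bits - 1)) & 1    # bit leaving the left end of Q
        q = (q << 1) & ((1 << bits) - 1)
        for i in range(digits):          # left shift of the casregister
            shifted = (regs[i] << 1) | carry
            carry = shifted >> 4         # bit passed on to the next register
            regs[i] = shifted & 0xF
        if step != bits - 1:             # no correction after the last shift
            regs = [r + 3 if r >= 5 else r for r in regs]
    return regs[::-1]                    # digits ordered C, B, A


print(binary_to_bcd(0b1010110101))
```

The correction is skipped after the final shift because no further doubling follows it, which matches the test that precedes the correction in each pass of the loop.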


2.6.4 Sequence Descriptions 


Fig. 2.17 An example showing the steps of the conversion process

Fig. 2.18 Sequence chart for a binary-to-decimal converter

The conversion process stated above can be described by the sequence chart in
Fig. 2.18. The sequence chart shows a loop where four micro-operations are performed,
namely, leftshift, count, test, and correction. The binary number is assumed initially
in register Q. This sequence is now described by the following execution statements:


Comment, here begins the conversion sequence (2.26)
/START(ON)/  A←0, B←0, C←0, D←10, FINI←OFF, T←0,
/SHIFT*P/    C-B-A-Q←cil C-B-A-Q, T←1,
/COUNT*P/    D←countdn D, T←2,
/TEST*P/     IF (D=0) THEN (T←4) ELSE (T←3),
/CORRECT*P/  A←cor A, B←cor B, C←cor C, T←0,
/LIGHT*P/    FINI←ON,
END


The above cor is the special operator defined by statements (2.25). 




2.7 Stored Carry Addition 


Mercer [2] proposed parallel addition of two unsigned binary numbers by a 
special parallel adder. This organization implements Mercer’s addition and its 
extensions to a zero test and a comparison test. 


2.7.1 Mercer's Addition Algorithm 


A half-adder has two inputs (X and Y) and two outputs (CARRY and SUM).
SUM is the logical EXCLUSIVE-OR of X and Y, and CARRY is the logical AND
of X and Y. Let the augend and the addend be n-bit positive binary integers. Mercer's
algorithm contains a series of steps of additions by a parallel adder which has n
half-adders (instead of full adders). During the first step, each pair of the augend bit
and the addend bit is added simultaneously by one of the n stages of half-adders; this
addition results in n sum bits and n carry bits. Both the sum bits and the carry bits
are stored and then added during the second step; this addition again results in n
sum bits and n carry bits. The sum bits and carry bits are again stored and added
during the next step. This addition process goes on until no carry bit occurs; at that
time, the addition is completed and the sum is the answer.

For n-bit numbers, the maximum time for addition by Mercer's algorithm is 
the time for n steps, but the average time of addition is much less. Burks and others 
[1] reported that the average time of addition is the time for 4.6 steps if n is 40. Mercer 
claimed that the average addition time for his algorithm is probably less because the 
average time of 4.6 steps was for addition of full-length numbers chosen at random, 
while some additions in his sequence are simple and short. Furthermore, the time for 
each step of addition by a parallel adder consisting of only half-adders is shorter 
because the carries are stored after each addition instead of being allowed to propa- 
gate. Mercer's adder is asynchronous because the exact time to complete the addition 
of two numbers is not known beforehand. 
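
The stored carry scheme can be illustrated with integers standing in for the rows of half-adders. The Python sketch below is an assumption-based illustration, not Mercer's circuit: the function name stored_carry_add and the parameter width are hypothetical, the EXCLUSIVE-OR of the operands models the sum bits, and their AND shifted one place left models the stored carries.

```python
def stored_carry_add(augend, addend, width=10):
    """Add two unsigned numbers by repeated half-adder steps.

    Returns (sum, overflow, steps).
    """
    mask = (1 << width) - 1
    a, n = augend & mask, addend & mask
    overflow = 0
    steps = 0
    while True:
        carries = a & n                       # carry outputs of the half-adders
        a = a ^ n                             # sum outputs of the half-adders
        shifted = carries << 1                # each carry moves one stage left
        overflow |= (shifted >> width) & 1    # carry out of the top stage
        n = shifted & mask                    # stored carries for the next step
        steps += 1
        if n == 0:                            # no carry left: addition is done
            break
    return a, overflow, steps


print(stored_carry_add(693, 173))
```

The third element of the result shows how the number of steps depends on the operands: adding 1 to 1023 with a width of 10 needs all ten steps, while most pairs finish in far fewer.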


2.7.2 Statement Description 


Let the two unsigned numbers be 10 bits and let them be initially stored in
registers A and N. Register OV stores the carry from the most significant stage of
the parallel adder during addition. A special logic network is provided to give an
output called LOGSUM. This output indicates presence of one or more carries from
all the stages of half-adders except the most significant stage. Lights LTOV and FINI
indicate overflow and the completion of addition, respectively. Register T, its attached
decoder, and clock P generate control signals. Switch START initiates the addition
sequence. These elements are shown in the block diagram of Fig. 2.19.

The sequential operation of the stored carry addition is described in the sequence
chart of Fig. 2.20. As shown, after initialization by the START switch, the logical
sum (i.e., logical EXCLUSIVE-OR) of the respective bits of registers A and N
is transferred to register A; these are the sum bits from the half-adders.
The carry bits of the addition are transferred to registers N and OV;
these are the carry output bits of the half-adders. At the same time, the signal on
terminal LOGSUM is transferred to register D, which is then tested for 0. If register
D does not contain 0, the above three micro-operations are repeated until it does.
When register D contains 0, it means that no carry has been generated during the
last step of addition. Register OV is next examined. If it contains 1, light LTOV is
turned to the ON position. In any case, light FINI is turned ON to indicate the
completion of addition.

Fig. 2.19 A configuration of Mercer's multiple step addition

The configuration and the sequence of Mercer’s stored carry addition are now 
described by the following statements: 


Comment, configuration for stored carry addition (2.27)
Register, A(1-10), $augend register
          N(1-10), $addend or carry register
          T(0-1), $control register
          D, $carry indicator
          OV, $overflow indicator
Switch,   START(ON), $start switch
Light,    LTOV(ON,OFF), $overflow light
          FINI(ON,OFF), $completion light
Terminal, C(0-9)=A(1-10)*N(1-10),
          LOGSUM=C(1)+C(2)+C(3)+C(4)+C(5)+C(6)+C(7)+C(8)+C(9),
Decoder,  K(0-3)=T,
Clock,    P,

Comment, here begins the addition sequence
/START(ON)/ D←0, OV←0, T←0, FINI←OFF, LTOV←OFF,
/K(0)*P/    A←A⊕N,
            N←C(1-9)-0,
            OV←OV+C(0),
            D←LOGSUM, T←1,
/K(1)*P/    IF (D=1) THEN (D←0, T←0) ELSE (IF (OV=1) THEN
            (T←2) ELSE (T←3)),
/K(2)*P/    LTOV←ON, T←3,
/K(3)*P/    FINI←ON,
END

Fig. 2.20 Sequence chart for Mercer's multiple addition


2.7.3 Description of a Zero Test 


The above description can be further extended to implement a test sequence
which tests if a given number is zero. This test is based on the logic that the addition
of a number to itself generates at least one carry unless the number is 0.

This description employs the same configuration except for the need of register
ZERO. The number to be tested is placed in register A. The contents of register
A are then transferred to register N. Addition is next performed and the signal
on terminal LOGSUM is also transferred to register D. If register D contains a 0,
the number in register A is 0 and register ZERO is set to 1 to indicate the result of
zero. The test is completed. The statement description of the zero test is shown
below:


Comment, here begins zero test sequence (2.28)
Register, ZERO,
/START(ON)/ D←0, T←0,
/K(0)*P/    N←A, T←countup T,
/K(1)*P/    D←LOGSUM, T←countup T,
/K(2)*P/    IF (D=0) THEN (ZERO←1), T←countup T,
END


2.7.4 Description of a Comparison Test 


The description for the stored carry addition may also be extended to carry
out an equality test which tests if two numbers are equal. This test is based on the
logic that the logical EXCLUSIVE-OR of two binary numbers is zero if and only if the
two numbers are identical.

This description employs the same configuration as that for the zero test sequence.
The two unsigned binary numbers are stored in registers A and N. To compare the
two numbers, the logical EXCLUSIVE-OR of the two numbers is obtained by the special
parallel adder and stored in register A. Then it proceeds to the zero test for the result
in register A. The statement description of the comparison test is shown below:


Comment, here begins the comparison test sequence (2.29)
Register, ZERO,
Decoder,  K(0-4)=T,
/START(ON)/ D←0, FINI←OFF, T←0,
/K(0)*P/    A←A⊕N, T←countup T,
/K(1)*P/    N←A, T←countup T,
/K(2)*P/    D←LOGSUM, T←countup T,
/K(3)*P/    IF (D=0) THEN (ZERO←1), T←countup T,
/K(4)*P/    FINI←ON,
END


2.8 A Bowling Score Computer 


The ten-pin bowling game has been a popular game, and the algorithm for com- 
puting the score is quite well known. This organization describes a computer which 
computes the score of a player as the score of each ball comes in. 


2.8.1 Rules for Playing a Bowling Game 


In a ten-pin bowling game, the ten pins are set up in a triangular pattern at one
end of an alley and a ball is rolled toward them from the other end to knock down
as many pins as possible. The rolling of one or two balls constitutes a frame. After
each frame, the ten pins are set up again. A game consists of ten frames.

The score of a ball is the number of pins that are knocked down. If a player
knocks down all ten pins with the first ball, it is called a strike. Such a ball constitutes
a frame. The score for a strike is the ball score (i.e., 10) plus a bonus which is the score
of two succeeding balls. If he knocks down all ten pins with two balls, it is called a
spare; these two balls constitute a frame. The score for a spare is the sum of the two
balls (i.e., 10) plus a bonus which is the score of one succeeding ball. If he does not
knock down all ten pins with two balls, these two balls also constitute a frame; the
score is the sum of the two balls (i.e., 0 to 9) without a bonus. For each frame, no
more than two balls can be thrown. Occurrence of a strike or spare in a frame will
be referred to as the status of the frame.

For each ball, there is a ball score. For each frame, there is a frame score. The 
frame score is computed from the ball scores of the frame and the bonus for the strike 
or spare. After a ball is rolled, the score which can be computed from the ball score 
and the frame status is called the accumulated score. The score of a game is the sum 
of the scores of the ten frames, and is the accumulated score at the end of the game. 
The game ends when enough balls are rolled for computing the score of the tenth 
frame. Since the bonus for the strike or the spare is determined by the ball scores 
of the next frame or frames, one or two extra balls are rolled, if needed, in the case 
of the last frame. The minimum number of balls rolled in a game is 12, while the 
maximum is 21. The minimum score is 0 which occurs when no ball knocks down 
any pin, while the maximum score is 300 which occurs when each ball knocks down 
all ten pins. The game is played by two or more players. The one with the highest 
final score is the winner. 

The example of computing the score of a bowling game is shown in Table 2.4. 
For convenience, the score of a bowling game is sometimes called the final score. 


TABLE 2.4 Frame Scores of a Bowling Game

          BALL SCORE                                 FRAME SCORE
FRAME     1st    2nd    FRAME STATUS     NO BONUS     BONUS     WITH BONUS
  1        9      1     Spare               10         10           20
  2       10      -     Strike              10         0+9          19
  3        0      9     Two balls            9          -            9
  4       10      -     Strike              10         3+7          20
  5        3      7     Spare               10          3           13
  6        3      5     Two balls            8          -            8
  7       10      -     Strike              10        10+4          24
  8       10      -     Strike              10         4+5          19
  9        4      5     Two balls            9          -            9
 10        8      1     Two balls            9          -            9

FINAL SCORE                                                        150



Table 2.4 shows the ball scores, the frame status, the frame scores with and without 
bonus, and the final score. As shown, there are four strikes and two spares. The 
final score is 150. 
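
The scoring rules above can be illustrated by computing the frame scores from the list of ball scores in Table 2.4. The Python sketch below is an illustration only; the function name frame_scores and the flat list of ball scores are assumptions of the example, and the list is assumed to contain any extra balls needed for the tenth frame.

```python
def frame_scores(balls):
    """Return the ten frame scores (with bonus) for a complete game."""
    scores = []
    i = 0                                    # index of the first ball of the frame
    for _ in range(10):
        if balls[i] == 10:                   # strike: bonus is the next two balls
            scores.append(10 + balls[i + 1] + balls[i + 2])
            i += 1
        elif balls[i] + balls[i + 1] == 10:  # spare: bonus is the next ball
            scores.append(10 + balls[i + 2])
            i += 2
        else:                                # two balls, no bonus
            scores.append(balls[i] + balls[i + 1])
            i += 2
    return scores


game = [9, 1, 10, 0, 9, 10, 3, 7, 3, 5, 10, 10, 4, 5, 8, 1]
print(frame_scores(game), sum(frame_scores(game)))
```

This approach needs the ball scores of later frames before a strike or spare frame can be scored, which is why a score computed ball by ball calls for a different organization.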


2.8.2 Computation of Bowling Score 


One way to compute the score of a bowling game is to compute the score of
each frame and then add the scores of the ten frames to give the final score. An
example is shown in Table 2.4, where the frame scores with and without a bonus
and the bonus are also shown. Computation of a frame score is influenced by
the status of the current frame and the next frame. There are six frame statuses of
interest:

1. two balls,
2. spare,
3. strike,
4. spare followed by a strike,
5. strike followed by a spare, and
6. strike followed by a strike.


It should be noticed that computation of a frame score with a strike or a spare 
requires the ball score (or scores) of the next frame (or frames). To compute the 
up-to-the-frame score of a player requires the computation of the accumulated score 
as each ball score comes in. The final score is the accumulated score at the end 
of the game. An example is shown in Table 2.5, where the ball scores (and thus the 
final score) are the same as those in Table 2.4. As shown, for a given ball score, the 
accumulated score requires one, two, or three additions of the ball score, depending 
on the frame status. For example, one addition is sufficient for the first or the second 
ball score of a two-ball frame (see ball nos. 15 and 16 in Table 2.5). However, two 
additions of the first ball score are required if the status of the preceding frame is a
spare (see ball nos. 7, 8, and 9), and two additions for each of the two ball scores are
required if the status of the preceding frame is strike (see ball nos. 3, 4, and 5). A
similar situation exists for a strike followed by a spare (see ball nos. 6, 7, and 8) or a 
spare followed by a strike (see ball nos. 1-5). In case a strike is followed by another 
strike (see ball nos. 11 and 12), three additions of the score of the first ball of the next 
frame are required. 

Terminating the computation of the accumulated score requires determining whether
zero, one, or two extra balls are needed. Examples of extra balls for the 10th frame are shown
in Table 2.6 where all six cases of frame status are included. As shown, no extra ball 
is required when the frame status is two-ball. One extra ball is required when it is 
spare, and two extra balls are required for all the remaining four cases. Each score of 
these extra balls is to be added once to the accumulated score except the first ball 
score of the strike-strike case; this ball score is to be added twice. 


TABLE 2.5 Accumulated Scores of a Bowling Game

BALL     BALL    FRAME    FRAME       ACCUMULATED SCORE
NUMBER   SCORE   NUMBER   STATUS      (a) + (b) + (c) + (d) = (e)

 1        9       1                     0 +  9 +  - +  - =   9
 2        1       1       Spare         9 +  1 +  - +  - =  10
 3       10       2       Strike       10 + 10 + 10 +  - =  30
 4        0       3                    30 +  0 +  0 +  - =  30
 5        9       3       Two-ball     30 +  9 +  9 +  - =  48
 6       10       4       Strike       48 + 10 +  - +  - =  58
 7        3       5                    58 +  3 +  3 +  - =  64
 8        7       5       Spare        64 +  7 +  7 +  - =  78
 9        3       6                    78 +  3 +  3 +  - =  84
10        5       6       Two-ball     84 +  5 +  - +  - =  89
11       10       7       Strike       89 + 10 +  - +  - =  99
12       10       8       Strike       99 + 10 + 10 +  - = 119
13        4       9                   119 +  4 +  4 +  4 = 131
14        5       9       Two-ball    131 +  5 +  - +  5 = 141
15        8      10                   141 +  8 +  - +  - = 149
16        1      10       Two-ball    149 +  1 +  - +  - = 150

Note: (a) Initial value of the accumulated score
      (b) Current ball score
      (c) Bonus due to strike or spare
      (d) Bonus due to strike-strike
      (e) New value of the accumulated score


TABLE 2.6 Examples of Extra Balls for the 10th Frame

                        BALL SCORES
FRAME STATUS      9th frame    10th frame    EXTRA BALLS
Two-ball          -            3, 5          -
Spare             -            4, 6          3
Strike            -            10            3, 5
Spare-strike      4, 6         10            3, 5
Strike-spare      10           4, 6          3, 5
Strike-strike     10           10            3,* 5

*This ball score is to be added twice.


2.8.3 Configuration 


The configuration for a bowling score computer is shown in the block diagram
of Fig. 2.21. This configuration does not include those elements for manual insertion
of each ball score by the player. It is assumed that, when the ball score is available
at the input terminals IN(0-3), light INPUT is at the ON condition to indicate its
availability.

Fig. 2.21 Configuration of the bowling score computer


Register A is the input register where each ball score is received. B is the ball number
counter, and FC is the frame number counter. When a frame has two balls, the score of
the first ball is temporarily stored in register D, and the score of the second ball is
stored in register A. Addition is performed by the parallel adder, which takes the
operands from registers SC and A and stores the sum in register SC. Register SC
stores the accumulated score and thus contains the final score when the computation
is completed.

When it is 1, register ST indicates a strike, register SP indicates a spare, register
FB indicates the first ball, and register F10 indicates the last frame. When it is turned
to the ON condition, light WAIT indicates that the computer is waiting for the next
ball score, light INPUT indicates that the next ball score is available, and light OVER
indicates that the game is over. Switch START starts the operation of the computer.
Switch COMPUTE commands the start of the computation as well as the continuation
of computation after waiting for the next ball score. Register T, the decoder with
terminals K, and clock P generate the control signals.

The above configuration is now described by the following statements:


Comment, configuration of a bowling score computer                    (2.30)
Register,  B(0-4),           $ball number counter
           A(0-3),           $input register
           FC(0-3),          $frame number counter
           SC(0-8),          $score register for the accumulated score
           D(0-3),           $first-ball-score register
           T(0-2),           $control register
           ST,               $strike indicator (when 1)
           SP,               $spare indicator (when 1)
           FB,               $first-ball indicator (when 1)
           F10,              $tenth-frame indicator (when 1)
Terminal,  IN(0-3),          $inputs where the ball score appears
Decoder,   K(0-12)=T,
Switch,    START(ON),        $manual command to start
           COMPUTE(ON),      $manual command to compute
Light,     WAIT(ON,OFF),     $when ON, it indicates waiting for ball score
           INPUT(ON,OFF),    $when ON, it indicates arrival of ball score
           OVER(ON,OFF),     $when ON, it indicates the game is over
Clock,     P


2.8.4 Sequence Charts 


Sequence charts which describe sequential operations of the bowling score com- 
puter are shown in Figs. 2.22 and 2.23. The sequence chart in Fig. 2.22 does not include 
termination computation for clarity while the one in Fig. 2.23 does. 

Fig. 2.22 Sequence chart for computing bowling score (no termination computation)

Fig. 2.23 Sequence chart for computing bowling score

As shown in Fig. 2.22, when switch START is first turned to the ON position to
initialize computer operation, light WAIT is set to the ON condition. The computer
now waits for the arrival of the ball score. When the ball score arrives at the input 
terminals, light INPUT is set to the ON condition. As a result, the operator of the 
computer turns switch COMPUTE to the ON position which sets light WAIT to 
the OFF condition, thus beginning the computation of the accumulated score of the 
incoming ball score. The input data is first accepted into register A, and light INPUT 
is turned to the OFF condition. If either register ST or register SP contains 1, the 
ball score is added once more. If registers ST and SP both contain 1, the ball score is 
added the third time. At this time, register ST is reset to 0, but register SP is set to 
1 only if the last frame has a strike. Register FB is next tested to determine whether 
the current ball score is the score for the first ball or the second. If it is the first ball’s, 
it is next determined whether a strike has occurred. If a strike has occurred, register 
ST is set to 1 and frame counter FC is incremented by 1; otherwise, the first ball 
score in register A is temporarily stored in register D and register FB is set to 0 to 
indicate that the next ball is a second ball. If it is the second ball's, it is next determined
whether a spare has occurred by testing if the sum of the contents of registers
A and D is 10. If the sum is 10, register SP is set to 1; this is followed by setting
register FB to 1 and by incrementing frame counter FC by 1. At this time, the com- 
putation for the current ball score is completed. Ball number counter B is next incre- 
mented by 1 and light WAIT is turned to the ON condition to wait for the next 
ball score. 
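
The register transfers just described can be traced in software. The Python sketch below is a hedged illustration of the sequence of Fig. 2.22 only (termination computation is omitted); the variable names mirror the registers, while the function name and the use of plain integers in place of fixed-width registers are assumptions made here.

```python
def run_bowling_sequence(ball_scores):
    """Trace the no-termination sequence of Fig. 2.22 for a list of ball scores."""
    SC, FC, B = 0, 0, 1            # accumulated score, frame counter, ball counter
    ST, SP, FB, D = 0, 0, 1, 0     # strike, spare, first-ball flags; first-ball score
    trace = []
    for A in ball_scores:          # A <- IN for each arriving ball score
        SC += A                    # the ball score is always added once
        if SP or ST:               # a spare or strike is pending: add once more
            SC += A
        if SP and ST:              # strike followed by strike: add a third time
            SC += A
        SP, ST = ST, 0             # SP <- ST, ST <- 0
        if FB == 1:                # first ball of a frame
            if A == 10:
                ST, FC = 1, FC + 1     # strike ends the frame
            else:
                D, FB = A, 0           # keep first-ball score, expect second ball
        else:                      # second ball of a frame
            if A + D == 10:
                SP = 1                 # spare
            FB, FC = 1, FC + 1
        B += 1                     # B <- countup B, then wait for the next ball
        trace.append(SC)
    return trace, FC


trace, frames = run_bowling_sequence(
    [9, 1, 10, 0, 9, 10, 3, 7, 3, 5, 10, 10, 4, 5, 8, 1])
print(trace, frames)
```

With the ball scores of Table 2.5 the trace reproduces column (e) of that table and the frame counter ends at 10.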

When termination computation is incorporated, the sequence chart in Fig. 2.22 
becomes the sequence chart in Fig. 2.23. Termination computation requires the 
recognition of the last frame and then the decision whether one or two extra ball 
scores are needed. If they are needed, the accumulated score for the extra ball or balls 
is computed, while the frame counter is kept from incrementing. As mentioned, one 
addition of the ball score for each of the extra balls is required except the score of 
the first ball of the strike-strike case. In this case two additions are required. When 
the final score is obtained, light OVER is turned to the ON condition to indicate 
that the game is over and the computation is completed. 


2.8.5 Sequence Description 


The computation algorithm shown in the sequence chart of Fig. 2.23 is now 
described by the following execution statements 


Comment, here begins computation of the accumulated score              (2.31)
/START(ON)/    B←1, SC←0, FC←0, F10←0, ST←0, SP←0, FB←1,
               WAIT←ON, INPUT←OFF, OVER←OFF, T←0,
/COMPUTE(ON)/  WAIT←OFF, T←0,
/K(0)*P/       IF (WAIT=OFF) THEN (A←IN, INPUT←OFF, T←1),
/K(1)*P/       SC←SC add 0-0-0-0-0-A, T←2,
/K(2)*P/       IF (SP+ST) THEN (SC←SC add 0-0-0-0-0-A), T←3,
/K(3)*P/       IF (F10=0) THEN (T←4) ELSE (T←10),
/K(4)*P/       IF (SP*ST) THEN (SC←SC add 0-0-0-0-0-A), T←5,
/K(5)*P/       SP←ST, ST←0, T←6,
/K(6)*P/       IF (FB=0) THEN (T←7) ELSE (T←8),
/K(7)*P/       IF ((A add D)=10) THEN (SP←1), FB←1, FC←countup FC, T←9,
/K(8)*P/       IF (A=10) THEN (ST←1, FC←countup FC, T←9)
               ELSE (D←A, FB←0, T←11),
/K(9)*P/       IF (FC=10) THEN (F10←1, T←10) ELSE (T←11),
/K(10)*P/      IF (SP+ST=1) THEN (T←12) ELSE (SP←ST, ST←0, T←11),
/K(11)*P/      B←countup B, WAIT←ON, T←0,
/K(12)*P/      OVER←ON,
END
References

1. Burks, A. W., Goldstine, H. H., and von Neumann, J., "Preliminary Discussion
   on the Design of an Electronic Computing Instrument," Institute for Advanced Study,
   June, 1946.
2. Mercer, R. J., "Micro-Programming," J. of the ACM, April, 1957, pp. 157-171.
3. Couleur, J. F., "BIDEC-A Binary-to-Decimal or Decimal-to-Binary Converter,"
   IRE Transactions on Electronic Computers, December, 1958, pp. 313-316.
4. Croy, J. E., "Rapid Technique of Manual or Machine Binary-to-Decimal Integer
   Conversion Using Decimal Radix Arithmetic," IRE Transactions on Electronic Computers,
   December, 1961, p. 777.
5. Wilson, M. C., "Gray to Binary Converter," Instruments & Control Systems, June, 1962,
   pp. 149-150.
6. Rozier, C. P., "Decimal-to-Binary Conversion Using Octal Radix Arithmetic," IRE
   Transactions on Electronic Computers, October, 1962, pp. 708-709.
7. Chu, Y., Digital Computer Design Fundamentals. New York: McGraw-Hill Book
   Company, 1962, pp. 13-14; 143-145.
8. Chu, Y., "An Algol-like Computer Design Language," Comm. of ACM, October, 1965,
   pp. 607-615.
9. Chu, Y., and Mesztenyi, C. K., "Macro Logic Design of Digital Computers," Technical
   Report 67-57, Computer Science Center, University of Maryland, November, 1967.
10. Knuth, D. E., The Art of Computer Programming, Vol. 1. Reading, Massachusetts:
    Addison-Wesley Publishing Co., Inc., 1968, pp. 143-145.
11. Van Tassel, D., "The Chinese Remainder Theorem," Computer Decisions, November,
    1969, p. 69.
12. Singleton, R. C., "A Prime Number Generator using the Treesort Principle, Algorithm
    356," Communications of the ACM, October, 1969, p. 563.
13. Singleton, R. C., "An Efficient Prime Number Generator, Algorithm 357,"
    Communications of the ACM, October, 1969, p. 563.


Problems*

2.1. Simulate the serial parity generator as described by statements (2.7) and (2.8).

2.2. Generation of a parity bit described by statements (2.7) and (2.8) can be speeded up if
     bits A(5, 6) and register PB are examined at the same time and if register A is shifted
     to the right in two-bit positions. Draw a sequence chart and describe the new parity
     generator by CDL statements.

2.3. Simulate the serial comparator as described by statements (2.9) and (2.10).

2.4. Simulate the serial comparator as described by statements (2.9) and (2.11).

2.5. If the two numbers in the signed 2's complement representation are compared, draw a
     sequence chart and describe the algorithm by CDL for determining whether one is
     greater than, equal to, or smaller than the other.

2.6. Simulate the sequence which finds the largest number among n integers described by
     statements (2.12).

2.7. Repeat Problem 2.6 for the description in (2.13).

2.8. Simulate the prime number generator described by the statements (2.16) and (2.17).

2.9. What are the advantages and disadvantages if division is provided in the prime number
     generator instead of subtraction and counting?

2.10. Simulate the Gray-to-binary code converter described by statements (2.22) and (2.23).

2.11. Simulate the binary-to-decimal converter described by statements (2.24), (2.25), and
      (2.26).

2.12. If register O is eliminated in the binary-to-decimal converter in Fig. 2.16, what changes
      should be made in the sequence chart of Fig. 2.18 assuming that the 10-bit input is
      initially stored in casregister C-B-A?

2.13. Given a 3-digit BCD integer, describe a converter which converts the BCD integer into
      a binary integer.
      (a) Show a configuration block diagram.
      (b) Draw a sequence chart.
      (c) Describe the sequence by CDL statements.

2.14. Simulate the stored carry addition described by statements (2.27).

2.15. Simulate the zero test sequence described by statements (2.28).

2.16. Simulate the comparison test sequence described by statements (2.29).

2.17. Simulate the bowling score computer described by statements (2.30) and (2.31).

*Those problems where simulation is required refer to the use of an algorithmic programming
language such as Fortran, Algol, or PL/1.



Chapter 3 Microprogramming

The type of control that sequences the micro-operations of the organizations in
Chapter 2 is known as sequential logic control. It is so called because sequences
of control signals are generated by a logic network. In 1951, a different type of
control called microprogram control (2, 5, 8) was proposed by M. V. Wilkes of
England (1). The idea of microprogram control is to replace the control logic
network with a program and to store the program in a memory, which need only be
a read-only memory. This memory is often called the control memory. The program
stored in the control memory is called the microprogram. The task of preparing
the microprogram is called microprogramming. When a digital computer employs
microprogram control, it is called a microprogrammed computer.

This chapter introduces the concept and principle of microprogram control
and microprogramming by presenting four examples. The first example introduces
microprogram control and examines the difference between sequential logic control
and microprogram control. The second example describes a microprogrammed
computer, and the third example a stored logic computer. The last example deals
with the additional consideration of timing among the main memory cycle, the
control memory cycle, and the clock cycle.


3.1 А Parity Generator 


A serial parity generator was described in Chapter 2. The configuration of the 
generator was shown in the block diagram of Fig. 2.1, the sequence was described in 
the chart of Fig. 2.2, and the complete description was presented in statements (2.7) 
and (2.8). The generator in Chapter 2 makes use of sequential logic control, while 
the generator to be described here utilizes microprogram control. 

In statement description (2.8), control register T generates the control sequence. 
Each bit of register T is assigned to command one or two micro-operations. Bit T(1) 
commands the EXCLUSIVE-OR and count micro-operations. Bit T(2) commands 
the circulate-rightshift micro-operation. Bit T(3) commands a conditional branching
micro-operation. And bit T(4) turns on light FINI. These five micro-operations,

PB←A(6)⊕PB,                                                            (3.1)
C←countup C,
A←cir A,
IF (C=6) THEN (...) ELSE (...),
FINI←ON,


are again required in the case of microprogram control as they perform the function 
of generating the parity bit. 
3.1.1 Microprogram Control Configuration 


Figure 3.1 shows the configuration for the serial parity generator by micro- 
program control. This configuration is described by the statements, 


Comment, microprogram control configuration of parity generator        (3.2)
Register,  A(1-6),               $shift register
           C(0-2),               $counter
           PB,                   $parity bit register
           H(0-2),               $address register
           F(0-9),               $buffer register
Memory,    CM(H)=CM(0-7,0-9)     $control memory
Switch,    START(ON)             $start switch
Light,     FINI(ON,OFF)          $completion indicator
Clock,     P(1-2)                $two-phase clock

Fig. 3.1 Configuration for a serial parity generating sequence
by microprogrammed control


In the configuration in (3.2), control register T is replaced by control memory CM 
together with address register H and buffer register F. Furthermore, the two-phase 
clock is used instead of the single-phase clock. 


3.1.2 Control Word Format 


In addition to the five micro-operations in (3.1), three more micro-operations,

F←CM(H),                                                               (3.3)
H←countup H,
IF (C=6) THEN (H←countup H) ELSE (H←F(0-2)),

are required due to the use of the control memory. The first micro-operation reads
a word out of the control memory and transfers it to the buffer register F. The second
micro-operation increments register H by 1. The third is a conditional micro-operation
which tests and branches the sequence; this micro-operation replaces the similar
one in statements (3.1). Thus, there is a total of seven micro-operations.




Each word in the control memory is called a control word or a micro-instruction.
The microprogram is the sequence of control words in the control memory. The
format of the control word is shown in Fig. 3.2. There are 10 bits. Bits F(0-2) form an
address field for the H register. Each of the remaining seven bits is assigned to one
of the seven micro-operations. When bit F(j) is 1, where j is 3 through 9, the command
signal for the jth micro-operation is generated. Since more than one F(j) bit can be
1, more than one micro-operation can occur simultaneously. As an example, four
control words are shown in Fig. 3.3, where the commas merely improve readability
but do not actually exist. These four control words, as will be described, constitute
the microprogram for the parity generating sequence.

F(0-2)    address field
F(3)      PB←A(6)⊕PB
F(4)      C←countup C
F(5)      A←cir A
F(6)      IF (C=6) THEN (H←countup H) ELSE (H←F(0-2))
F(7)      FINI←ON
F(8)      H←countup H
F(9)      F←CM(H)

Fig. 3.2 Control word format

Memory
address    Control memory CM
0          0,001,100,011
1          0,000,010,011
2          0,000,001,001
3          0,000,000,100

Fig. 3.3 Microprogram in the control memory


3.1.3 Sequence Description 


The sequential operation of the generator is prescribed in the sequence chart in 
Fig. 2.2. The control signals for prescribing the sequence are obtained as follows. 
Each micro-operation in the sequence chart is assigned the control bit according to 
the control word format. These micro-operations are activated during clock phase 
P(1) except that the micro-operation to fetch a control word from the control memory 
occurs during clock phase P(2). With these choices of control bits and clock phases, 
a control signal can be given to each micro-operation. The seven micro-operations 
can now be described by the following execution statements:

/F(3)*P(1)/   PB←A(6)⊕PB,                                              (3.4)
/F(4)*P(1)/   C←countup C,
/F(5)*P(1)/   A←cir A,
/F(6)*P(1)/   IF (C=6) THEN (H←countup H) ELSE (H←F(0-2)),
/F(7)*P(1)/   FINI←ON,
/F(8)*P(1)/   H←countup H,
/F(9)*P(2)/   F←CM(H),


With the above execution statements, the sequence in Fig. 2.2 can now be
described as below.

Comment, start the parity generating sequence                          (3.5)
/START(ON)/   PB←1, H←0, C←0, FINI←OFF, F←1,
/F(9)*P(2)/   F←CM(H),
Comment, generate the current parity bit
/F(3)*P(1)/   PB←A(6)⊕PB,
/F(4)*P(1)/   C←countup C,
/F(8)*P(1)/   H←countup H,
/F(9)*P(2)/   F←CM(H),
Comment, circular rightshift
/F(5)*P(1)/   A←cir A,
/F(8)*P(1)/   H←countup H,
/F(9)*P(2)/   F←CM(H),
Comment, test and branch
/F(6)*P(1)/   IF (C=6) THEN (H←countup H) ELSE (H←F(0-2)),
/F(9)*P(2)/   F←CM(H),
Comment, completion indication
/F(7)*P(1)/   FINI←ON
END


3.1.4 Microprogram 


The execution statements in (3.5) are divided into five groups. Headed by a
comment statement, each group carries out the micro-operations in one of the five
blocks in the sequence chart of Fig. 2.2. The first comment statement refers to the
group of micro-operations which initialize the execution of the sequence when switch
START is turned to the ON position. These micro-operations set register PB to 1,
reset registers H and C to 0, turn light FINI to the OFF condition, and set register
F to 0001 (octal), which is the code to command the micro-operation of reading a word
out of the control memory. The second comment statement refers to the group of four
micro-operations specified by the first control word, 0143 (octal), in the control memory
shown in Fig. 3.3. This control word contains four 1's which command these
four micro-operations, namely, set parity bit, increment C, increment H, and read
the control memory. The third comment statement refers to the group of
micro-operations specified by the second control word, 0023 (octal), in the control memory.
This control word contains three 1's which command these three micro-operations:
circulate rightshift, increment H, and read the control memory. The fourth comment
statement refers to the group of micro-operations specified by the third control word,
0011 (octal), in the control memory. This control word contains two 1's which command
these two micro-operations: test-branch and read the control memory. The last
comment statement refers to the micro-operation to turn light FINI to the ON
condition as specified by the fourth control word, 0004 (octal), in the control memory.
Thus, statements (3.5) describe the generating sequence by the control words of the
microprogram in Fig. 3.3.
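
The correspondence between the octal codes and the micro-operations they command can be verified mechanically. The following Python sketch decodes a 10-bit control word according to the format of Fig. 3.2; the names `MICRO_OPS` and `decode` are introduced here for illustration only, and F(0) is taken as the most significant bit.

```python
# Micro-operation commanded by each control bit F(3) through F(9), per Fig. 3.2
MICRO_OPS = {
    3: "PB <- A(6) xor PB",
    4: "C <- countup C",
    5: "A <- cir A",
    6: "IF (C=6) THEN (H <- countup H) ELSE (H <- F(0-2))",
    7: "FINI <- ON",
    8: "H <- countup H",
    9: "F <- CM(H)",
}


def decode(word):
    """Split a 10-bit control word into (address field, commanded micro-operations)."""
    bits = format(word, "010b")      # bits[0] is F(0), bits[9] is F(9)
    address = int(bits[0:3], 2)      # address field F(0-2)
    ops = [MICRO_OPS[j] for j in range(3, 10) if bits[j] == "1"]
    return address, ops


# The four control words quoted in the text, written as octal literals
MICROPROGRAM = [0o143, 0o023, 0o011, 0o004]

for location, word in enumerate(MICROPROGRAM):
    address, ops = decode(word)
    print(location, format(word, "04o"), address, ops)
```

Decoding 0143 yields the four micro-operations of the second group in (3.5), and decoding 0011 yields the test-branch and read micro-operations with address field 0, so that a failed test returns the sequence to the first control word.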


3.1.5 Control Cycle 


When the microprogram is being executed, the operation of the parity generator 
follows the sequence chart in Fig. 2.2. The execution of each control word of the 
microprogram can be described by the control cycle chart shown in Fig. 3.4, where 
there is a control loop with as many branches as the number of micro-operations. 
During each cycle of the loop, one or more micro-operations specified by the control 
word are executed simultaneously during the first clock phase; and the next control 
word is read out of the control memory during the second clock phase. The next 
control word is located either by incrementing register H or by transferring the address
field F(0-2) to the H register. At present, the speed in executing this control loop
limits the speed of microprogram controlled computers.

Fig. 3.4 Control cycle chart for the serial parity generating sequence
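
The control loop can be exercised with a small simulation. The Python sketch below is an illustration under stated assumptions, not the author's description: registers are modeled as Python integers and lists, F(0) is taken as the most significant bit of F, `cir` is modeled as a one-place circulate to the right, and the function names are hypothetical. Each pass of the loop performs the P(1) micro-operations selected by F and then, if F(9) is 1, reads the next control word at P(2).

```python
def cir(a):
    """Circulate a list of bits one place to the right; the last bit wraps around."""
    return [a[-1]] + a[:-1]


def run_parity_generator(a_bits, control_memory=(0o143, 0o023, 0o011, 0o004)):
    """Execute the microprogram on the six bits A(1)..A(6) given in `a_bits`."""
    A = list(a_bits)                      # A[-1] plays the role of A(6)
    PB, H, C, FINI = 1, 0, 0, False       # START(ON): PB<-1, H<-0, C<-0, FINI<-OFF
    F = 0o001                             # F<-1: only F(9) is set
    while True:
        def bit(j, word=F):
            return (word >> (9 - j)) & 1  # value of control bit F(j)

        # Clock phase P(1): all commanded micro-operations see the old register values
        new_PB = (A[-1] ^ PB) if bit(3) else PB
        new_C = C + 1 if bit(4) else C
        new_A = cir(A) if bit(5) else A
        new_H = H
        if bit(6):
            new_H = H + 1 if C == 6 else (F >> 7) & 0b111   # branch to F(0-2)
        if bit(8):
            new_H = H + 1
        if bit(7):
            FINI = True
        PB, C, A, H = new_PB, new_C, new_A, new_H

        # Clock phase P(2): fetch the next control word, if commanded
        if not bit(9):
            break
        F = control_memory[H]
    return PB, A, FINI


print(run_parity_generator([1, 0, 1, 1, 0, 0]))
```

Under these assumptions PB ends as 1 exclusive-ORed with all six bits of A, register A returns to its initial contents after six circulations, and the loop stops at the fourth control word, whose F(9) bit is 0.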


3.1.6 Comparison 


The control in the configuration in Fig. 2.1 is implemented by the logic network
associated with register T; its speed of operation is limited by the speed of logic
circuitry. The control in the configuration of Fig. 3.1 is carried out by the
microprogram stored in the control memory; its speed of operation is limited by the speed
of the control memory. Since, in the current state of technology, the control memory
operates considerably slower than the logic circuitry, microprogram control does not
give the fastest speed of operation. On the other hand, the cost of sequential logic
control is roughly proportional to the complexity of the control, while the cost
of microprogram control is essentially determined by the cost of the memory and,
within limits, is relatively insensitive to the complexity of the control.

Thus, it is concluded that, at present, microprogram control may not be advantageous
in speed, but it offers flexibility and economy in more complex control.


3.2 A Microprogrammed Computer 


With microprogram control introduced, a simple microprogrammed computer
is now described. This is the same stored-program computer described in
Chapter 1 except that microprogram control is instituted instead of sequential logic
control.


3.2.1 Configuration 


The configuration of the computer is shown in Fig. 3.5. The computer has a 
random access memory M with address register C and buffer register R. Memory 
M has a capacity of 32,768 24-bit words. There are program register D, arithmetic 
register A, start-stop control register G, and switches START, STOP, and POWER 
in addition to the three-phase clock P. For microprogram control, there is control 
memory CM with address register H and buffer register F. This configuration is now 
described by the following declaration statements:

Comment, configuration of the microprogram-controlled computer         (3.6)
Register,     R(0-23),                   $buffer register for memory M
              A(0-23),                   $arithmetic register
              C(0-14),                   $address register for memory M
              D(0-14),                   $program register
              F(0-23),                   $buffer register for memory CM
              H(0-9),                    $address register for memory CM
              G,                         $start-stop control register
              READ,                      $initiate memory read operation
Subregister,  R(OP)=R(0-5),              $op-code part of register R
              R(ADDR)=R(9-23),           $address part of register R
              F(ADDR)=F(0-9),            $address part of register F
Memory,       M(C)=M(0-32767,0-23)       $main memory
              CM(H)=CM(0-1023,0-23)      $control memory
Switch,       POWER(ON),                 $power switch
              START(ON),                 $start switch
              STOP(ON),                  $stop switch
Clock,        P(1-3),                    $three-phase clock

Fig. 3.5 Configuration of a simple, microprogrammed computer


3.2.2 Control Signals 


The main memory cycle is chosen to coincide with the clock cycle. There are
three steps in a main memory cycle, each controlled by one clock phase. When a
word is to be read out of the main memory, the transfer of the memory address to
the address register and the initiation of the memory read operation both occur at
clock phase P(3) of the preceding clock cycle. The word is available in register
R at clock phase P(1) of the current clock cycle. Thus, we have,

/P(3)/  C←R(ADDR) or C←D     $beginning of a main memory cycle        (3.7)
/P(1)/  R←M(C)
/P(2)/                        $end of a main memory cycle


When a word is to be written into the main memory, the transfer of the memory
address to the address register and the initiation of the memory write operation both
occur at clock phase P(3) of the preceding clock cycle. The word is written into the
memory from register R at clock phase P(2) of the current clock cycle. Thus, we have,

/P(3)/  C←R(ADDR) or C←D     $beginning of a main memory cycle        (3.8)
/P(1)/
/P(2)/  M(C)←R                $end of a main memory cycle


The control memory is a read-only memory. Its memory cycle is chosen to
coincide with the clock cycle in the following manner. The transfer of the memory
address to the address register and the initiation of the memory read operation both
occur at clock phase P(2) and the word is available in register F at clock phase P(3).
Thus, we have,

/P(1)/                                                                 (3.9)
/P(2)/  H←R(OP) or H←F(ADDR)  $beginning of a control memory cycle
/P(3)/  F←CM(H)               $end of a control memory cycle


The timing in (3.9) reads a control word out of the control memory at clock phase 
P(3) so that an execution sequence (such as ADD sequence and SUB sequence to be 
shown subsequently) can begin at clock phase P(1). 


3.2.3 Control Word Format 


Table 3.1 shows the format of the control word. The control word consists of
24 bits. There are three fields. Field F(0-9) contains a control memory address for
branching; field F(10-22) contains the control bits; and field F(23) is not used. The
control bits together with the respective clock phases are assigned to the micro-operations
from the sequence chart of Fig. 1.5 and to the additional micro-operations
required for the microprogram control.

TABLE 3.1 Control Word Format

CONTROL BIT    CLOCK PHASE    MICRO-OPERATIONS
F(0-9)                        control memory address
F(10)          P(1)           R←M(C)
F(11)          P(2)           H←R(OP), D←countup D
F(12)          P(3)           IF (G) THEN (F←CM(H), C←R(ADDR))
                              ELSE (H←0, C←0, D←0, R←0)
F(13)          P(2)           H←F(ADDR)
               P(3)           C←D, F←CM(H)
F(14)          P(1)           R←A
               P(2)           M(C)←R
F(15)          P(2)           A←A add R
F(16)          P(2)           A←A sub R
F(17)          P(1)           D←R(ADDR)
F(18)          P(1)           IF (A(0)) THEN (D←R(ADDR))
F(19)          P(1)           A←shr A
F(20)          P(1)           A←cil A
F(21)          P(1)           A←0
F(22)          P(1)           G←0
F(23)                         Not used


3.2.4 Sequence Description 


With the control word format in Table 3.1, the sequence chart of the computer
in Fig. 1.5 can now be described by the following execution statements:

Comment, sequences of the microprogram controlled computer             (3.10)
Comment, initialization
/POWER(ON)/      G←0, F←0, H←0, C←0, D←0, R←0,
Comment, start operation
/START(ON)/      G←1, F(12)←1, F(0-11,13-22)←0,
Comment, stop operation
/STOP(ON)/       G←0,
Comment, fetch sequence when H=0
/F(10)*P(1)/     R←M(C),
/F(11)*P(2)/     H←R(OP), D←countup D,
/F(12)*P(3)/     IF (G) THEN (F←CM(H), C←R(ADDR))
                 ELSE (H←0, C←0, D←0, R←0),
Comment, ADD sequence when H=1
C/F(10)*P(1)/    R←M(C),
/F(15)*P(2)/     A←A add R,
/F(13)*P(2)/     H←F(ADDR),
/F(13)*P(3)/     C←D, F←CM(H),
Comment, SUB sequence when H=2
C/F(10)*P(1)/    R←M(C),
/F(16)*P(2)/     A←A sub R,
C/F(13)*P(2)/    H←F(ADDR),
C/F(13)*P(3)/    C←D, F←CM(H),
Comment, JOM sequence when H=3
/F(18)*P(1)/     IF (A(0)) THEN (D←R(ADDR)),
C/F(13)*P(2)/    H←F(ADDR),
C/F(13)*P(3)/    C←D, F←CM(H),
Comment, STO sequence when H=4
/F(14)*P(1)/     R←A,
/F(14)*P(2)/     M(C)←R,
C/F(13)*P(2)/    H←F(ADDR),
C/F(13)*P(3)/    C←D, F←CM(H),
Comment, JMP sequence when H=5
/F(17)*P(1)/     D←R(ADDR),
C/F(13)*P(2)/    H←F(ADDR),
C/F(13)*P(3)/    C←D, F←CM(H),
Comment, SHR sequence when H=6
/F(19)*P(1)/     A←shr A,
C/F(13)*P(2)/    H←F(ADDR),
C/F(13)*P(3)/    C←D, F←CM(H),
Comment, CIL sequence when H=7
/F(20)*P(1)/     A←cil A,
C/F(13)*P(2)/    H←F(ADDR),
C/F(13)*P(3)/    C←D, F←CM(H),
Comment, CLA sequence when H=8
C/F(10)*P(1)/    R←M(C),
/F(21)*P(1)/     A←0,
C/F(15)*P(2)/    A←A add R,
C/F(13)*P(2)/    H←F(ADDR),
C/F(13)*P(3)/    C←D, F←CM(H),
Comment, STP sequence when H=9
/F(22)*P(1)/     G←0,
C/F(13)*P(2)/    H←F(ADDR),
C/F(13)*P(3)/    C←D, F←CM(H),
END
The above starting operations need explanation. When the START switch is
turned to the ON position, register G and bit F(12) are both set to 1. This causes the
execution of the following micro-operation:

IF (G) THEN (F←CM(H), C←R(ADDR)) ELSE (H←0, C←0, D←0, R←0)             (3.11)

The clock phase following the turning of switch START can be P(1), P(2), or P(3).
If it is P(1) or P(2), nothing occurs. If it is P(3), the above conditional microstatement
is executed. Since G has become 1, the execution of the above microstatement reads
the first control word out of the memory and the computer operation is thus started.
On the other hand, if the STOP switch is turned to the ON position when the computer
is operating, the computer continues execution of the current instruction until
clock phase P(3) of the fetch sequence is reached. At that time, no control word is
read out of the control memory, and the computer is in the waiting state until the
START switch is turned on.

Letter C appears at the first position of some of the above execution statements.
The presence of letter C in a statement denotes that the statement is a comment
statement. These comment statements are not needed because they have appeared
elsewhere. However, their presence makes the description more readable.


3.2.5 Microprogram


As shown, the sequence description (3.10) is divided into 13 groups. Each group 
is headed by a comment statement and specifies the steps of micro-operations of a 
particular sequence. The first three groups describe manual controls of start and stop 
operations and the fourth group specifies the fetch sequence. Each of the remaining 
groups specifies one of the nine execution sequences. The micro-operations in the 
first three groups are controlled manually and thus do not appear in the micropro- 
gram. The micro-operations in each of the other groups are controlled by one control 
word. For the 10 groups, there are 10 control words in the microprogram, one for 
the fetch sequence and the other nine for the nine execution sequences. The micro- 
program of the computer is shown in Table 3.2. 


TABLE 3.2 The Microprogram of the Microprogrammed Computer

Columns: Code, H, F(0-9), F(10), F(11), ..., F(22)
Rows:    FETCH, ADD, SUB, JOM, STO, JMP, SHR, CIL, CLA, STP
[The entries of the table are not legible in this copy.]

Note: H=control memory address
      F(0-9)=next control memory address, shown in decimal number
      F(10-22)=control bits
      F(23) is not used and not shown
F(23) is not used and not shown 


Since the size of the microprogram affects the capacity and speed of the control
memory, the microprogram control is so organized that both the number and the
length of the control words can be reduced. In the sequence description (3.10), there
are 20 micro-operations (excluding those by manual control) as shown in Table 3.3.
If one control bit is assigned to each micro-operation, the control word would require
20 bits in addition to the 10 address bits, or a total of 30 bits. The control word in
Table 3.1, however, has a word length of only 24 bits (with one bit unused). This
reduction from 30 to 24 bits is achieved in a number of ways. One way is to group
micro-operations under one control signal. For example, in Table 3.1, two
micro-operations are grouped under control signal F(11)*P(2) because they occur at the
same time. But this reduction gives up the use of each individual micro-operation if
such a use occurs later. Another way of reducing the control bits is to group those
micro-operations which can take place in the different phases of the same clock cycle
under one control bit. For example, as shown in Table 3.1, micro-operation H←F(ADDR)
at clock phase P(2) and micro-operations C←D and F←CM(H), both at clock phase
P(3), can be grouped under control bit F(13). This reduction again limits the future
use of the individual micro-operations. A third way of reducing control bits is to use
a decoder. Let the six control bits F(17-22) be replaced by three bits F(17-19), and
attach a decoder to these three bits as specified by the decoder statement:


Decoder, K(0-5)=F(17-19)      (3.12)


The six terminals of the decoder are K(j) where j is equal to 0-5. The six new control
signals are K(0)*P(1), ..., K(5)*P(1). This reduction is possible because only one
of the six micro-operations occurs at any one time. But the use of the decoder limits
the simultaneous operation of more than one of these six micro-operations in the
future. In short, the reduction of control bits is at the expense of flexibility of
microprogram control for reprogramming.


TABLE 3.3 Micro-operations of the Microprogrammed Computer

                                   MICRO-INSTRUCTIONS
                              Test         Operands or Constants
MICRO-OPERATIONS   Op-code  Condition    1st    2nd        3rd
R←M(C)                0                  R      M
M(C)←R                1                  M      R
F←CM(H)               2       G          F      CM
H←R(OP)               3                  H      R(OP)
H←F(ADDR)             4                  H      F(ADDR)
C←R(ADDR)             5       G          C      R(ADDR)
D←R(ADDR)             6       A(0)       D      R(ADDR)
D←countup D           7                  D      D
A←shr A               8                  A      A
A←cil A               9                  A      A
A←A add R            10                  A      A          R
A←A sub R            11                  A      A          R
R←A                  12                  R      A
C←D                  13                  C      D
R←0                  14       G'         R      0
A←0                  15                  A      0
H←0                  16       G'         H      0
C←0                  17       G'         C      0
D←0                  18       G'         D      0
G←0                  19                  G      0
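The bit counts quoted above, and the behavior of the decoder in statement (3.12), can be checked with a short sketch. This is only an illustration in Python: the function names are hypothetical, and the treatment of the two unused codes (6 and 7) is an assumption, since the text does not say what the decoder does with them.

```python
from math import ceil, log2

ADDRESS_BITS = 10        # F(0-9), the next control memory address
MICRO_OPERATIONS = 20    # number of rows in Table 3.3


def one_bit_per_operation(address_bits, operations):
    """Word length when every micro-operation gets its own control bit."""
    return address_bits + operations


def encoded_field_width(choices):
    """Bits needed to select exactly one of `choices` mutually exclusive signals."""
    return ceil(log2(choices))


def decode(field, terminals=6):
    """Model of 'Decoder, K(0-5)=F(17-19)': returns K(0)..K(5) as a one-hot list.

    Codes 6 and 7 have no terminal, so no K line is raised for them
    (an assumption made for this sketch).
    """
    return [1 if j == field else 0 for j in range(terminals)]


print(one_bit_per_operation(ADDRESS_BITS, MICRO_OPERATIONS))  # 30
print(encoded_field_width(6))                                  # 3
print(decode(2))                                               # [0, 0, 1, 0, 0, 0]
```

Because `decode` raises at most one K line, two of the six micro-operations can never be requested by the same control word, which is the loss of flexibility the paragraph describes.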




In a commercial, large-scale computer there are a very large number of micro- 
operations for a large instruction set, and the size of the control memory may become 
impractical. Therefore, the tradeoff between the size of the microprogram and the 
limitation on flexible microprogramming is an interesting design problem. 


3.3 A Stored Logic Computer 


The microprogrammed computer described previously employs a read-only 
control memory. The use of this additional random access memory contributes to 
the cost of a microprogrammed computer. It is possible, at the expense of speed, to 
retain the microprogram control without the use of the extra memory if a part of the 
main memory is used as the control memory. Since the main memory is a read-write 
memory, the microprogram now stored in the main memory could also be changed 
or expanded. Such a microprogrammed computer is called a stored logic computer. 

In order to illustrate the stored logic computer, the control memory of the 
previously described microprogrammed computer is eliminated. Instead, the first 
1024 locations of the main memory are used as the control memory. Thus, the main 
memory stores two kinds of instructions, the conventional instruction and the micro- 
instruction. 


3.3.1 Configuration 


The configuration of the stored logic computer is shown in Fig. 3.6. It is described
by the following declaration statements:


Comment, configuration of the stored logic computer      (3.13)
Register,    R(0-23),     $memory buffer register
             A(0-23),     $accumulator
             C(0-14),     $memory address register
             D(0-14),     $program register
             F(0-23),     $micro-instruction register
             OPR(0-14),   $operand address register
             G,           $start-stop control register
Subregister, R(OP)=R(0-5),
             R(ADDR)=R(9-23),
             F(ADDR)=F(0-9),
Memory,      M(C)=M(0-32767,0-23),
Switch,      POWER(ON),
             START(ON),
             STOP(ON),
Clock,       P(1-6),


Fig. 3.6 Configuration of a stored logic computer

The above configuration differs from the configuration described by statements (3.6) 
in the elimination of control memory CM and its address register H, in the provision 


of operand address register OPR, and in the replacement of a three-phase clock by a 
six-phase clock. 




3.3.2 Sequential Operations 


The micro-operations of this computer may be divided into five types: those to
initialize, to start, to stop, to fetch an instruction, and to execute an instruction. The
initializing micro-operations are


G←0, F←0, D←2000₈, OPR←2000₈,      (3.14)


When register G is reset to 0, the computer is in the stop state. If register F is reset
to 0, no control signal is generated by the control word. Registers D and OPR are set
to 2000₈ so that the first instruction is the word located at address 2000₈.

The starting sequence consists of three steps:


step 1, G←1, C←0,                      (3.15)
step 2, IF (G=1) THEN (R←M(C)),
step 3, F←R, C←OPR,


When register G is set to 1, the computer enters the go state. Address register C is
reset to 0 and the memory read operation is initiated. The first micro-instruction
located at address 0 is next read out of the memory into register R. The
micro-instruction is then transferred to register F and register C is set to address 2000₈
where the first instruction is stored.

The stopping sequence consists of two steps:


step 1, G←0,                                            (3.16)
step 2, IF (G=0) THEN (F←0, D←2000₈, OPR←2000₈),


When register G is reset to 0, registers F, D, and OPR are initialized. The computer 
is then ready to be started again. 
The fetch sequence consists of four steps:


step 1, R←M(C),                                         (3.17)
step 2, C←R(OP), OPR←R(ADDR), D←countup D,
step 3, R←M(C),
step 4, F←R, C←OPR


There are two accesses of the memory during the fetch sequence. The first access 
fetches the instruction. The instruction is normally located by the address in the D 
register, but by the address in the OPR register when the computer is first started. 
The second access fetches the next micro-instruction located by the op-code part of 
the instruction obtained by the first access of the memory. The micro-instruction is 
then transferred to the F register. 
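The four steps of the fetch sequence (3.17) can be traced with a small simulation. This is a sketch in Python, not part of the original description: the register file is a plain dictionary, `field` and `fetch` are hypothetical helper names, and bit 0 is assumed to be the leftmost bit of the 24-bit word, so that R(OP)=R(0-5) is the top six bits and R(ADDR)=R(9-23) the bottom fifteen.

```python
WORD_BITS = 24


def field(word, first, last):
    """Extract bits first..last of a 24-bit word, bit 0 being the leftmost bit."""
    width = last - first + 1
    shift = WORD_BITS - 1 - last
    return (word >> shift) & ((1 << width) - 1)


def fetch(memory, regs):
    """Run the four steps of the fetch sequence (3.17) on `regs` in place."""
    regs["R"] = memory[regs["C"]]              # step 1: R <- M(C), the instruction
    regs["C"] = field(regs["R"], 0, 5)         # step 2: C <- R(OP)
    regs["OPR"] = field(regs["R"], 9, 23)      #         OPR <- R(ADDR)
    regs["D"] = (regs["D"] + 1) % (1 << 15)    #         D <- countup D (15-bit register)
    regs["R"] = memory[regs["C"]]              # step 3: R <- M(C), the micro-instruction
    regs["F"] = regs["R"]                      # step 4: F <- R
    regs["C"] = regs["OPR"]                    #         C <- OPR


# Example: one instruction at octal 2000 with op-code 1 and operand address octal 3000.
memory = [0] * 32768
memory[1] = 0o12345670                         # arbitrary micro-instruction at address 1
memory[0o2000] = (1 << 18) | 0o3000
regs = {"R": 0, "C": 0o2000, "D": 0o2000, "OPR": 0o2000, "F": 0}
fetch(memory, regs)
print(oct(regs["F"]), oct(regs["C"]), oct(regs["D"]))
```

After the call, F holds the micro-instruction selected by the op-code, C points at the operand, and D points at the next instruction, which matches the two memory accesses described above.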

An instruction-execution sequence consists of at least four steps:


step 1, R←M(C),                        (3.18)
step 2, C←F(ADDR),
step 3, R←M(C),
step 4, F←R, C←D,


Again, there are two accesses of the memory. The first access fetches the operand 
located by the address in the OPR register. The second access fetches the next micro- 
instruction located by the address part of the previous micro-instruction. The micro- 
instruction is transferred to the F-register. The next instruction address in register 
D is transferred to address register C. 


3.3.3 Control Signals 


Each clock cycle consists of six clock phases. There are two memory cycles in
each clock cycle, one memory cycle for fetching the micro-instruction and the other
for fetching the operand. The steps for these two fetches and their control signals are
described below.


/P(6)/   C←D or C←OPR            $beginning of a memory cycle      (3.19)
/P(1)/   R←M(C)
/P(2)/                           $end of a memory cycle
/P(3)/   C←F(ADDR) or C←R(OP)    $beginning of a memory cycle
/P(4)/   R←M(C)
/P(5)/                           $end of a memory cycle


The micro-operations and their control signals for writing a word into the memory
are described below.


/P(6)/   C←OPR                   $beginning of a memory cycle      (3.20)
/P(1)/
/P(2)/   M(C)←R                  $end of a memory cycle


As shown above, each memory cycle coincides with three clock phases.


3.3.4 Control Word Format


After the steps for initialization, start, stop, instruction fetch, and instruction 
executions are established, control word format is next chosen. In this computer, the 
format is chosen essentially the same as that shown in Table 3.1 except for those 
changes that are necessary to adopt the use of a part of the main memory as the 
control memory. This format is shown in Table 3.4. 




TABLE 3.4 Control Word Format

CONTROL   CLOCK
BIT       PHASE    MICRO-OPERATIONS
F(0-9)             Next micro-instruction address
F(10)     P(1)     R←M(C)
F(11)     P(2)     C←R(OP), D←countup D
F(12)     P(4)     IF (G) THEN (R←M(C), OPR←R(ADDR))
                   ELSE (F←0, D←2000₈, OPR←2000₈)
          P(6)     F←R, C←OPR
F(13)     P(3)     C←F(ADDR)
          P(4)     R←M(C)
          P(6)     F←R, C←D
F(14)     P(1)     R←A
          P(2)     M(C)←R
F(15)     P(2)     A←A add R
F(16)     P(2)     A←A sub R
F(17)     P(1)     D←OPR
F(18)     P(1)     IF (A(0)) THEN (D←OPR)
F(19)     P(1)     A←shr A
F(20)     P(1)     A←cil A
F(21)     P(1)     A←0
F(22)     P(1)     G←0
F(23)              Not used


3.3.5 Sequence Description 


With the control word format in Table 3.4, the sequence chart in Fig. 1.5 can be
described by the following execution statements:

Comment, sequences of the stored logic computer      (3.21)
Comment, initialization
/POWER(ON)/        G←0, F←0, D←2000₈, OPR←2000₈,
Comment, start operation
/START(ON)*P(3)/   G←1, F(12)←1, C←0,
Comment, stop operation
/STOP(ON)/         G←0,
Comment, fetch sequence when C=0
/F(10)*P(1)/       R←M(C),
/F(11)*P(3)/       C←R(OP), D←countup D,
/F(12)*P(4)/       IF (G) THEN (R←M(C), OPR←R(ADDR))
                   ELSE (F←0, D←2000₈, OPR←2000₈),
/F(12)*P(6)/       F←R, C←OPR,
Comment, ADD sequence when C=1
C/F(10)*P(1)/      R←M(C),
/F(15)*P(2)/       A←A add R,
/F(13)*P(3)/       C←F(ADDR),
/F(13)*P(4)/       R←M(C),
/F(13)*P(6)/       F←R, C←D,
Comment, SUB sequence when C=2
C/F(10)*P(1)/      R←M(C),
/F(16)*P(2)/       A←A sub R,
C/F(13)*P(3)/      C←F(ADDR),
C/F(13)*P(4)/      R←M(C),
C/F(13)*P(6)/      F←R, C←D,
Comment, JOM sequence when C=3
/F(18)*P(1)/       IF (A(0)) THEN (D←OPR),
C/F(13)*P(3)/      C←F(ADDR),
C/F(13)*P(4)/      R←M(C),
C/F(13)*P(6)/      F←R, C←D,
Comment, STO sequence when C=4
/F(14)*P(1)/       R←A,
/F(14)*P(2)/       M(C)←R,
C/F(13)*P(3)/      C←F(ADDR),
C/F(13)*P(4)/      R←M(C),
C/F(13)*P(6)/      F←R, C←D,
Comment, JMP sequence when C=5
/F(17)*P(1)/       D←OPR,
C/F(13)*P(3)/      C←F(ADDR),
C/F(13)*P(4)/      R←M(C),
C/F(13)*P(6)/      F←R, C←D,
Comment, SHR sequence when C=6
/F(19)*P(1)/       A←shr A,
C/F(13)*P(3)/      C←F(ADDR),
C/F(13)*P(4)/      R←M(C),
C/F(13)*P(6)/      F←R, C←D,
Comment, CIL sequence when C=7
/F(20)*P(1)/       A←cil A,
C/F(13)*P(3)/      C←F(ADDR),
C/F(13)*P(4)/      R←M(C),
C/F(13)*P(6)/      F←R, C←D,
Comment, CLA sequence when C=8
C/F(10)*P(1)/      R←M(C),
/F(21)*P(1)/       A←0,
C/F(15)*P(2)/      A←A add R,
C/F(13)*P(3)/      C←F(ADDR),
C/F(13)*P(4)/      R←M(C),
C/F(13)*P(6)/      F←R, C←D,
Comment, STP sequence when C=9
/F(22)*P(1)/       G←0,
C/F(13)*P(3)/      C←F(ADDR),
C/F(13)*P(4)/      R←M(C),
C/F(13)*P(6)/      F←R, C←D,

END


3.3.6 Microprogram


Like description (3.10) for the microprogrammed computer, the sequence
description for the stored logic computer is also divided into 13 groups. Each group
is headed by a comment statement and specifies the micro-operations of a particular 
sequence. Because the micro-operations of the first three groups are manually con- 
trolled, they do not appear in the microprogram. The micro-operations of the remain- 
ing 10 groups are converted into 10 micro-instructions which form the microprogram. 
Owing to the similar formats of the control words in Tables 3.1 and 3.4, the micro- 
program in Table 3.2 is also the microprogram for the stored logic computer. 
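The conversion from groups of execution statements to micro-instructions can be sketched as follows. This is an illustration in Python under stated assumptions: the bit sets are read off the labels of description (3.21), counting the comment-marked statements too; F(0) is taken to be the leftmost bit of the 24-bit word; and the next-address field F(0-9) is left at 0 by default because its values are not recoverable from this copy of Table 3.2. The names `SEQUENCE_BITS`, `control_word`, and `as_bit_string` are hypothetical.

```python
WORD_BITS = 24

# Control bits F(i) that appear in the labels of each sequence of (3.21).
SEQUENCE_BITS = {
    "FETCH": {10, 11, 12},
    "ADD": {10, 13, 15},
    "SUB": {10, 13, 16},
    "JOM": {13, 18},
    "STO": {13, 14},
    "JMP": {13, 17},
    "SHR": {13, 19},
    "CIL": {13, 20},
    "CLA": {10, 13, 15, 21},
    "STP": {13, 22},
}


def control_word(bits, next_address=0):
    """Pack control bits F(10-22) and the address field F(0-9) into one word."""
    word = next_address << (WORD_BITS - 10)    # F(0-9) occupies the ten leftmost bits
    for i in bits:
        word |= 1 << (WORD_BITS - 1 - i)       # F(i) counted from the left
    return word


def as_bit_string(word):
    """Render a word so that character i of the string is bit F(i)."""
    return format(word, "024b")


for name, bits in SEQUENCE_BITS.items():
    print(f"{name:5s} {as_bit_string(control_word(bits))}")
```

Each printed row is one micro-instruction with the address field still to be filled in. Storing the ten words in the first locations of main memory gives the microprogram of the stored logic computer.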


3.4 A Microprogrammed Sequence 


As a final example of microprogram control, the sequence for finding the largest
number among the n unsigned binary numbers in Chapter 2 is chosen. The
configuration of this sequence was described in the block diagram in Fig. 2.7, the sequence
in the chart in Fig. 2.8, and the statement description in statements (2.12).




3.4.1 Microprogram Control Configuration


The microprogram control configuration is shown in the diagram in Fig. 3.7.
Control memory CM has a capacity of 256 36-bit words with address register H and
buffer register F. The four-phase clock P(0-3), the single-bit register D, and the 4-bit
register MC generate the control signals. Register STATUS is a status bit. Register
READ initiates a memory read operation. Register MEMSTART initiates memory
operation after being turned to 1 by switch RESET. Switch RESET initiates the
operation. The above is now described in the following:


Fig. 3.7 Microprogram control configuration for finding-the-largest-number sequence


Comment, the microprogram control configuration      (3.22)
Register,  H(1-8),      $control memory address register
           F(1-36),     $control word register
           MC(0-3),     $main memory cycle register
           D,           $memory cycle wait register
           STATUS,      $status bit
           READ,        $initiate memory read operation
           MEMSTART,    $start memory operation
Block,     ADDRESS(IF (STATUS) THEN (H←F(1-8)) ELSE (H←countup H)),
Memory,    CM(H)=CM(0-255,1-36),
Switch,    RESET(ON),   $start switch
Clock,     P(0-3),      $four-phase clock
Comment, register MC is sequenced as a ring counter
/P(3)/     MC←cir MC,


3.4.2 Timing and Control Signals 


Each main memory cycle is chosen to coincide with four control memory cycles,
and each control memory cycle to coincide with one clock cycle. Therefore, there
are four steps in each control memory cycle and 16 steps in each main memory cycle.
The control signals for these 16 steps are described by the following sequence of 16
labels, in addition to the starting label and the label for cycling register MC.


Comment, control signals described by the labels      (3.23)
/RESET(ON)/     MC←8, MEMSTART←1,
/MC(0)*P(0)/    $beginning of both mm and cm cycles
/MC(0)*P(1)/
/MC(0)*P(2)/
/MC(0)*P(3)/    $end of a cm cycle
/P(3)/          MC←cir MC
/MC(1)*P(0)/    $beginning of a cm cycle
/MC(1)*P(1)/
/MC(1)*P(2)/
/MC(1)*P(3)/    $end of a cm cycle
/P(3)/          MC←cir MC
/MC(2)*P(0)/    $beginning of a cm cycle
/MC(2)*P(1)/
/MC(2)*P(2)/
/MC(2)*P(3)/    $end of a cm cycle
/P(3)/          MC←cir MC
/MC(3)*P(0)/    $beginning of a cm cycle
/MC(3)*P(1)/
/MC(3)*P(2)/
/MC(3)*P(3)/    $end of a cm cycle
/P(3)/          MC←cir MC


The above four steps in each control memory cycle are sequenced by the four phases
of clock P(0-3), and the four control memory cycles in each main memory cycle are
sequenced by the four states of ring counter MC(0-3) which is circularly right-shifted
at the end of each control memory cycle.
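The way the ring counter and the four-phase clock together produce the 16 labels can be sketched in a few lines. This is an illustrative Python model, not part of the original description: the helper names are hypothetical, MC(0) is assumed to be the leftmost bit of the 4-bit register (so that MC←8 sets MC(0) to 1), and the shift is applied after phase P(3) of each control memory cycle.

```python
def cir(mc, width=4):
    """Circular right shift by one place of a `width`-bit register."""
    low_bit = mc & 1
    return (mc >> 1) | (low_bit << (width - 1))


def active_stage(mc, width=4):
    """Index j of the single 1 bit, with MC(0) as the leftmost bit."""
    return format(mc, f"0{width}b").index("1")


def main_memory_cycle(mc=8):
    """Return the 16 step labels of one main memory cycle and the final MC value."""
    labels = []
    for _ in range(4):                  # four control memory cycles
        for phase in range(4):          # clock phases P(0)..P(3)
            labels.append(f"MC({active_stage(mc)})*P({phase})")
        mc = cir(mc)                    # /P(3)/ MC <- cir MC
    return labels, mc


labels, final_mc = main_memory_cycle()
for step, label in enumerate(labels, start=1):
    print(step, label)
```

In this numbering the second step is MC(0)*P(1), the sixth is MC(1)*P(1), and the twelfth is MC(2)*P(3). After four shifts the counter is back at 8, so the next main memory cycle starts from the same state.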

During each main memory cycle, a word is either read out of or written into
the main memory X. Assume that the transfer of the main memory address register
C and the initiation of the main memory read or write operation occur during the
second step. For a read operation, the word is available at buffer register R during
the sixth step. Thus, we have:


/MC(0)*P(1)/   C←K,          (3.24)
/MC(1)*P(1)/   R←X(C),


For a write operation, the word to be stored into the main memory is transferred into
buffer register R before the 12th step. Then, we may have:


/MC(0)*P(1)/   C←K,          (3.25)
/MC(2)*P(3)/   X(C)←R,


Certain micro-operations occur in every control memory cycle. The labels for
describing the control-memory control signals are:


/P(0)*D'/    $beginning of a cm cycle      (3.26)
/P(1)*D'/
/P(2)*D'/
/P(3)*D'/    $end of a cm cycle


When register D contains a 0, the above labels appear; otherwise, they disappear. 
Thus, register D controls the advance or halt of the steps in a control memory cycle. 

During each control memory cycle, a control word is read out of the control
memory. Assume that the transfer of the control memory address to register H and
the initiation of the control memory read both occur during clock phase P(3) of the
preceding control memory cycle, and the control word becomes available at buffer
register F during clock phase P(0) of the current control memory cycle, or


/P(3)*D'*F(15)/   DO ADDRESS,   $end of the preceding cm cycle      (3.27)
/P(0)*D'*F(9)/    F←CM(H),      $beginning of the current cm cycle
/P(1)*D'/
/P(2)*D'/
/P(3)*D'*F(15)/   DO ADDRESS,   $end of the current cm cycle


where control bits F(9) and F(15) are explained below. Micro-operations activated
by the control word in register F are executed during clock phases P(1-3) of the
current control memory cycle, as the control word is fetched during clock phase P(0).
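The choice of the next control memory address, and its gating by register D and control bit F(15), can be modelled as below. This is a sketch in Python with hypothetical function names. It assumes that H is an 8-bit register which wraps around on countup, a detail the text does not state.

```python
def address_block(h, f_address, status):
    """Block ADDRESS: IF (STATUS) THEN (H <- F(1-8)) ELSE (H <- countup H)."""
    if status:
        return f_address & 0xFF
    return (h + 1) & 0xFF               # 8-bit register; wrap-around is assumed


def do_address(h, f_address, status, d, f15):
    """/P(3)*D'*F(15)/ DO ADDRESS: H changes only when D = 0 and F(15) = 1."""
    if d == 0 and f15 == 1:
        return address_block(h, f_address, status)
    return h


print(do_address(4, 8, status=False, d=0, f15=1))  # sequential: H becomes 5
print(do_address(4, 8, status=True, d=0, f15=1))   # branch: H becomes F(1-8) = 8
print(do_address(4, 8, status=True, d=1, f15=1))   # D = 1: label disappears, H stays 4
```

The third call shows the effect of the D' term in labels (3.26): while D holds a 1, the address is frozen and the same control word stays in force.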


3.4.3 Control Word Format 


Table 3.5 shows the format of the control word. The 36 bits of the control word
in register F are divided into three groups: field F(1-8) which contains a control
memory address, field F(9-22) which contains the control bits, and field F(23-36)
which is not used. Field F(1-8) provides a branching address for each control word.
Each bit in field F(9-22) together with the respective clock phase is assigned to
control one or more micro-operations; thus, each micro-operation or group of
micro-operations is given a control signal.


TABLE 3.5 Control Word Format

CONTROL    CLOCK
BIT        PHASE    MICRO-OPERATIONS
F(1-8)              Control memory address
F(9)       P(0)     F←CM(H)
F(10)      P(1)     C←0, FINI←OFF, D←1
F(11)      P(1)     READ←1 (or R←X(C))
F(12)      P(1)     K←R(ADDR)
F(13)      P(1)     J←K
F(14)      P(2)     D←0
F(15)      P(3)     IF (STATUS) THEN (H←F(1-8)) ELSE (H←countup H)
F(16)      P(1)     C←K
F(17)      P(1)     A←R
F(18)      P(1)     K←countdn K
F(19)      P(1)     IF (BOR(0)=1) THEN (A←R, J←K)
F(20)      P(3)     STATUS←0
F(21)      P(1)     IF (K=0) THEN (FINI←ON, STATUS←1)
F(22)      P(1)     STATUS←1
F(23-36)            Not used


3.4.4 Microprogramming the Sequence 


With the control word format in Table 3.5, the sequence chart in Fig. 2.8 can now
be described by execution statements. The following description consists of two
parts, the initiation and the comparison.


Comment, the finding-the-largest-integer sequence      (3.28)
Comment, initialization part
/RESET(ON)/              MC←8, H←0, D←0, MEMSTART←1, FINI←OFF,
/P(3)/                   MC←cir MC,

Comment, beginning of a main memory cycle      (3.29)
/P(0)*D'*F(9)/           F←CM(H),       $fetch a control word (H=0)
/MC(0)*P(0)/
/MC(0)*P(1)*F(10)/       C←0, FINI←OFF, D←1,   $cm control signals disabled
/MC(0)*P(1)*F(11)/       READ←1,
/MC(0)*P(2)/
/MC(0)*P(3)/
/MC(1)*P(0)/
/MC(1)*P(1)*F(11)/       R←X(C),
/MC(1)*P(2)/
/MC(1)*P(3)/
/MC(2)*P(0)/
/MC(2)*P(1)*F(12)/       K←R(ADDR),
/MC(2)*P(2)/
/MC(2)*P(3)/
/MC(3)*P(0)/
/MC(3)*P(1)*F(13)/       J←K,
/MC(3)*P(2)*F(14)/       D←0,           $cm control signals enabled
/MC(3)*P(3)/
/MC(3)*P(3)*D'*F(15)/    DO ADDRESS,    $H←countup H

Comment, beginning of a main memory cycle      (3.30)
/P(0)*D'*F(9)/           F←CM(H),       $fetch a control word (H=1)
/MC(0)*P(0)/
/MC(0)*P(1)*F(16)/       C←K,
/MC(0)*P(1)*F(11)/       READ←1,
/MC(0)*P(2)/
/MC(0)*P(3)/
/P(3)*D'*F(15)/          DO ADDRESS,    $H←countup H
/P(0)*D'*F(9)/           F←CM(H),       $fetch a control word (H=2)
/MC(1)*P(0)/
/MC(1)*P(1)*F(11)/       R←X(C),
/MC(1)*P(2)/
/MC(1)*P(3)/
/P(3)*D'*F(15)/          DO ADDRESS,    $H←countup H
/P(0)*D'*F(9)/           F←CM(H),       $fetch a control word (H=3)
/MC(2)*P(0)/
/MC(2)*P(1)*F(17)/       A←R,
/MC(2)*P(1)*F(18)/       K←countdn K,
/MC(2)*P(2)/
/MC(2)*P(3)/
/P(3)*D'*F(15)/          DO ADDRESS,    $H←countup H
/P(0)*D'*F(9)/           F←CM(H),       $fetch a control word (H=4)
/MC(3)*P(0)/
/MC(3)*P(1)*F(21)/       IF (K=0) THEN (FINI←ON, STATUS←1),
/MC(3)*P(2)/
/MC(3)*P(3)*F(20)/       STATUS←0,
/P(3)*D'*F(15)/          DO ADDRESS,    $H←F(1-8) (H=5 or 8)

Comment, comparison part      (3.31)
Comment, beginning of a main memory cycle
/P(0)*D'*F(9)/           F←CM(H),       $fetch a control word (H=5)
/MC(0)*P(0)/
/MC(0)*P(1)*F(16)/       C←K,
/MC(0)*P(1)*F(11)/       READ←1,
/MC(0)*P(2)/
/MC(0)*P(3)/
/P(3)*D'*F(15)/          DO ADDRESS,    $H←countup H
/P(0)*D'*F(9)/           F←CM(H),       $fetch a control word (H=6)
/MC(1)*P(0)/
/MC(1)*P(1)*F(11)/       R←X(C),
/MC(1)*P(2)/
/MC(1)*P(3)/
/P(3)*D'*F(15)/          DO ADDRESS,
/P(0)*D'*F(9)/           F←CM(H),       $fetch a control word (H=7)
/MC(2)*P(0)/
/MC(2)*P(1)*F(19)/       IF (BOR(0)=1) THEN (A←R, J←K),
/MC(2)*P(1)*F(18)/       K←countdn K,
/MC(2)*P(1)*F(22)/       STATUS←1,
/MC(2)*P(2)/
/MC(2)*P(3)*F(20)/       STATUS←0,
/P(3)*D'*F(15)/          DO ADDRESS,    $H←F(1-8) (H=4)

END


In the above description, the initialization part takes almost the first two main 
memory cycles, and the comparison part takes one main memory cycle for each 
iteration. During the first main memory cycle, register D is set to 1 by control signal 
MC(0)*P(1)*F(10). When register D is set to 1, the generation of control signals in 
the control memory cycle is stopped (see the labels of statements (3.26)), while the 
generation of the control signals in the main memory cycle continues. The next 
control word is not fetched from the control memory until the beginning of the next 
main memory cycle; this gives time to access the main memory. As a result, the 
control word remains in register F during the first main memory cycle. Micro-operation
MC←cir MC is controlled by clock phase P(3) alone, as it occurs in every control
memory cycle.
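Stripped of its timing, description (3.28) amounts to a short loop, sketched here in Python. Several points are assumptions made for illustration: word X(0) is taken to hold the count n directly (the description reads it from the address part R(ADDR)), the numbers are taken to sit in X(1) through X(n), and BOR(0)=1 is read as "subtracting R from A produced a borrow", that is, A is smaller than R. BOR itself is defined with the sequence in Chapter 2 and is not restated here. The function name `find_largest` is hypothetical.

```python
def find_largest(x):
    """Return (A, J): the largest number in x[1..n] and its address, with n = x[0].

    Assumes n >= 1, as the sequence reads X(n) before testing K.
    """
    k = x[0]                # H=0: C <- 0, R <- X(C), K <- R(ADDR)
    j = k                   #      J <- K
    a = x[k]                # H=1..3: C <- K, R <- X(C), A <- R
    k -= 1                  #         K <- countdn K
    while k != 0:           # H=4: IF (K=0) THEN (FINI <- ON, STATUS <- 1)
        r = x[k]            # H=5..6: C <- K, R <- X(C)
        if a < r:           # H=7: IF (BOR(0)=1), read here as A < R
            a, j = r, k     #      THEN (A <- R, J <- K)
        k -= 1              #      K <- countdn K, then back to H=4
    return a, j


print(find_largest([4, 7, 3, 9, 5]))   # four numbers stored at addresses 1..4
```

The body of the `while` loop corresponds to control words 5 through 7 plus the test in word 4, which is why each iteration of the comparison part costs one main memory cycle.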


3.4.5 Microprogram 


The microprogram for the sequence of finding the largest number is shown in
Table 3.6 where there are eight control words. The existence of these control words
can be recognized by the presence of the following execution statement eight times,

/P(0)*D'*F(9)/ F←CM(H)

which reads a control word out of the control memory. Each of these eight control
words is described in the sequence description (3.28) by the execution statements
located in one control memory cycle. From this description, the 1's and 0's of each
control word are obtained as follows. The 1's are determined by those control bits
whose values are 1's in the labels of the execution statements in one control memory
cycle. Thus, the first control memory cycle consists of the first eight execution
statements in description (3.29). The control bits in the labels are F(9)-F(15); these are
the bits which are marked 1 in the first control word in Table 3.6. The second control
memory cycle consists of the first four execution statements in description (3.30).
The control bits in the labels are F(9), F(11), F(15), and F(16); these are the bits
which are marked 1 in the second control word in the table. Other control words
are similarly obtained. In this manner, the microprogram in Table 3.6 is prepared.


TABLE 3.6 Microprogram for Finding the Largest Number Sequence

Columns: F(1-8), F(9) through F(22); one row per control memory address H
[The entries of the table are not legible in this copy.]

Note: H is the control memory address


References 


1. WiLkrs, M. V., “The Best Way to Design an Automatic Calculating Machine,” 
Manchester University Computer Inaugural Conference, July, 1951, pp. 16-18. 


2. WILKES, M. V., and STRINGER, J. B., ^Microprogramming and the Design of the Control 
Circuits in an Electronic Digital Computer," Proceedings of the Cambridge Philosophical 
Society 49, Part 2, 1953, pp. 230-238. 


3. GLANTZ, H. Т., “А Note on Microprogramming," Journal of the Association for Com- 
puting Machinery 3, No. 2, April, 1956, pp. 77-84. 
4. Mercer, R. J., “Micro-programming,” J. of the ACM, April, 1957, рр. 157-171. 


5. Waxes, M. V., “Microprogramming,” Proceedings of the Eastern Joint Computer 
Conference, December, 1958, pp. 18-20. 


6. BLANKENBAKER, J. V., “Logically Microprogrammed Computers,” JRE Transactions on 
Electronic Computer ЕС-7, June, 1958, pp. 103-109. 


7. DINEEN, С. P., LEBow, I. L., and ВЕРЮ, I. S., “Тһе Logic of CG24,” Proceedings of the 
Eastern Joint Computer Conference, December, 1958, pp. 91-94. 


8. WiLKES, М. V., RENWICH, W., and WHEELER, D. J., “The Design of the Control Unit 
of an Electronic Digital Computer,” Proceedings of the IEE (London), Vol. 105, Pt. B, 
June, 1959, pp. 121-128. 


9. КАМРЕ, T. W., “The Design of a General-purpose Microprogram-controlled Computer 
with Elementary Structure," IRE Transactions on Electronic Computers EC-9, June, 
1960, pp. 208-213. 


10. SEMARNE, Н. M., and Porter, К. E., “А Stored Logic Computer,” Datamation 2, No. 
4, May, 1961, pp. 33-36. 


11. GRAsELLI, A., “Тһе Design of Program-modifiable Microprogrammed Control Units,” 
IEEE Transactions on Electronic Computers ЕС-11, June, 1962, pp. 336-339. 


12. НАСГМАВЕ, H., “Тһе KT Pilor Computer—a Microprogrammed Computer with a 
Phototransitor Fixed Memory,” Proceedings of AFIPS Congress, 1962, pp. 318-321. 


13. GERACE, G. B., *Microprogrammed Control for Computing Systems," IEEE Transac- 
tions on Electronic Computers, December, 1963, pp. 733-747. 


Problems 113 


14. 


BouTWELL, E., JR., and HoskiNsoN, Е. A., “The Logical Orgainization of the PB 440 
Microprogrammable Computer,” Proceedings of the FICC, 1963, pp. 201-213. 


15. AMDAHL, L. D., “Microprogramming and Stored Logic,” Datamation 10, 2, 1964, pp. 
24-26. 

16. McGEE, W. C., "The TRW-133 Computer,” Datamation 10, 2, 1964, pp. 27-29. 

17. Нил, В. H., “Stored Logic Programming and Applications,” Datamation 10, 2, 1964, 
pp. 36-39. 

18. Beck, L., and KEELER, F., “The С-8401 Data Processor,” Datamation 10, 2, 1964, pp. 
33-35. 

19. BouTWELL, Е. O., Л., “The PB440 Computer,” Datamation 10, 2, 1964, pp. 30-32. 

20. STEVENS, W. T., "The Structure of SYSTEM 360. Part II. System Implementation,” 
IBM System J. 3, 1964, pp. 136-143. 

21. Tucker, S. G., “Emulation of Large Systems,” Communications of the Association for 
Computing Machinery 8, No. 12, December, 1965, pp. 753-761. 

22. , “Microprogram for System/360," ІВМ Systems Journal 6, No. 4, 1967, рр. 
222-241. 

23. WEBER, H., “А Microprogrammed Implementation of EULER on IBM System/360 
Model 30," Communications of the ACM, September, 1967, pp. 549-558. 

24. Husson, S., *Microprogramming Manual for the 360 Model 50," TR 00.1479, ІВМ 
System Development Division, October 2, 1967. 

25. FLYNN, M. J., and MacLaren, M. D., “Microprogramming Revisited,” Proceedings of 
the ACM National Meeting, 1967, рр. 457-464. 

26. WILKES, M. V., “The Growth of Interest in Microprogramming: A Literature Survey,” 
Computing Surveys, December, 1969, pp. 139-145. 

27. ROSIN, R. F., “Contemporary Concepts of Microprogramming and Emulation,” Com- 
puting Surveys, December, 1969, pp. 197-212. 

28. Husson, S. S., Microprogramming: Principles and Practices. Englewood Cliffs, N.J.: 
Prentice-Hall, Inc., 1970. 

29. CHU, Y., Introduction to Computer Organization. Englewood Cliffs, N.J.: Prentice-Hall, 
Inc., 1970. 

Problems 


3.1. Construct the control cycle chart (similar to Fig. 3.4) for the microprogrammed
sequence for finding the largest number.

3.2. Construct the control cycle chart for the microprogrammed computer described by
statement (3.10).

3.3. Let octal 12 be the op-code for the instruction which circulates the contents of register
A three bit-positions to the right. Write a sequence of micro-instructions to implement
this instruction in the microprogrammed computer.

3.4. Use a programming language such as Fortran, Algol, or PL/1 to describe the
microprogrammed computer described by statements (3.6) and (3.10).

3.5. Table 3.3 lists the micro-operations of the microprogrammed computer described by
statements (3.6) and (3.10) together with their five fields. If each micro-operation is
regarded as a micro-instruction, prepare a microprogram by using these
micro-instructions to implement the microprogrammed computer. Other micro-operations
(thus, micro-instructions) may be added if they are needed.

3.6. Construct the control cycle chart for the stored logic computer.

3.7. Repeat Problem 3.4 for the stored logic computer described by statements (3.13) and
(3.21).

3.8. Suggest a solution so that the microprogram in the stored logic computer can be
protected from mistaken access by the user's program.

3.9. Write an assembly language program for finding the largest number described by the
flowchart in Fig. 2.6, and then determine the number of main memory cycles required
for executing the assembly language program. Compare the number of main memory
cycles thus found with the number of main memory cycles required for executing the
microprogram described by statements (3.28).

3.10. Use the control configuration similar to that described by statements (3.22) and
microprogram:
(a) the serial comparison sequence for two unsigned numbers described by the
sequence chart in Fig. 2.4, and
(b) the serial comparison sequence for two signed numbers described by the sequence
chart in Fig. 2.5.

3.11. Use the control configuration similar to that described by statements in (3.22) and
microprogram the sequence for generating the first 1000 prime numbers shown by
the sequence chart in Fig. 2.13.

3.12. Conceive a control configuration and microprogram the serial Gray-to-binary code
converter described by the sequence chart in Fig. 2.15.

3.13. Repeat Problem 3.12 for the binary-to-decimal converter described by the sequence
chart in Fig. 2.18.

3.14. Repeat Problem 3.12 for the Mercer's multiple addition described by the sequence
chart in Fig. 2.20.

3.15. Repeat Problem 3.12 for the bowling score computer described by the sequence chart
in Fig. 2.22.


Arithmetic instructions of a stored-program digital computer are executed in an
arithmetic unit. If the arithmetic unit adds and subtracts binary numbers in parallel
(that is, all bits of two numbers are being added at the same time), it is a parallel,
binary arithmetic unit. In a parallel arithmetic unit, there is a parallel adder or a
parallel subtractor, or there can be both, depending on the arithmetic algorithms
that are employed.

When binary numbers are represented by a sign bit and one or more number
bits with a binary point located fixedly somewhere between two neighboring
bits, such binary numbers are called fixed-point binary numbers. When binary
numbers are expressed in an exponent form, as further described in the next
chapter, they are called floating-point binary numbers. The arithmetic unit which
handles fixed-point numbers is called a fixed-point arithmetic unit, while the
arithmetic unit that handles floating-point numbers is called a floating-point
arithmetic unit. It is possible, of course, to have an arithmetic unit which can handle
both fixed-point and floating-point numbers.

This chapter describes a parallel arithmetic unit which is capable of perform- 
ing fixed-point binary addition, subtraction, multiplication, and division. This arith- 
metic unit is shown first in sequential logic control and then in microprogram 
control. A parallel, binary arithmetic unit capable of performing floating-point 
arithmetic is presented in the next chapter. 


A Fixed-Point Arithmetic Unit 


4.1 Configuration of the Arithmetic Unit 


The description of the arithmetic unit begins with the fixed-point number format, 
followed by the configuration and the parallel adder. The computer elements required 
for sequencing addition, subtraction, multiplication, and division will be described 
in a later section. 


4.1.1 Fixed-point Number Representation 


A binary number may be represented in a fixed-point format. In this format, 
there is a sign bit and one or more number bits, and the location of the binary point 
can be chosen between any two neighboring bits. There are two most commonly 
chosen locations of binary point; one choice makes the number represented as an 
integer, and the other makes the number represented as a fraction. Figure 4.1 shows 


Fig. 4.1 A fixed-point number format (a sign bit followed by the number bits)


a fixed-point format where the sign bit is located at the left of the most significant
number bit position. If the binary point is located between the sign bit and the most 
significant number bit, binary numbers with such a location of binary point are 
fractions. If the binary point is located to the right of the least significant number bit, 
the numbers so represented are integers. 

The fixed-point number format for the parallel, binary, fixed-point arithmetic unit 
is the format shown in Fig. 4.1. There are 35 number bits in addition to a sign bit. 
Thus, there are 36 bits in total. Integer representation is adopted. The binary numbers 
are represented in the signed magnitude representation. When the sign bit is 0, the 
binary number is positive; otherwise, the binary number is negative. Whenever nega- 
tive zero occurs, it is set to positive zero. It should be noted that there is no hardware 
which directly implements the binary point; instead, the binary point is indirectly 
implemented in the arithmetic algorithms. 
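The 36-bit signed magnitude format just described can be sketched in a few lines of Python. This is only an illustrative model, not part of the CDL description: the names encode and decode are hypothetical, and Python integers stand in for the 36-bit word.

```python
NUMBER_BITS = 35                       # number bits of the format in Fig. 4.1
MAG_MASK = (1 << NUMBER_BITS) - 1      # 35 ones


def encode(value):
    """Pack an integer into a 36-bit signed magnitude word (sign bit leftmost)."""
    magnitude = abs(value)
    if magnitude > MAG_MASK:
        raise OverflowError("magnitude does not fit in 35 number bits")
    sign = 1 if value < 0 else 0       # zero is always stored as positive zero
    return (sign << NUMBER_BITS) | magnitude


def decode(word):
    """Unpack a 36-bit signed magnitude word into an integer."""
    sign = (word >> NUMBER_BITS) & 1
    magnitude = word & MAG_MASK
    return -magnitude if sign else magnitude
```

Because the binary point is not implemented in hardware, the same word could equally be read as a fraction; the sketch adopts the integer reading used in this chapter.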




4.1.2 Configuration 


The parallel, binary, fixed-point arithmetic unit consists of four registers AC, 
MQ, SR, and SC, which are also known as accumulator, multiplier-quotient register, 
storage register, and shift counter, respectively. In addition, there are a parallel adder 
and three single-bit indicators, C, ADOV, and DVOV. The configuration is shown 
in Fig. 4.2. Each of registers AC, MQ, and SR is divided into two parts: one of them


Fig. 4.2 Configuration of a parallel, binary, fixed-point arithmetic unit


is for the sign and the other for the magnitude. To be specific, subregisters AC(S), 
MQ(S), and SR(S) hold the sign bits, and subregisters AC(M), MQ(M), and SR(M) 
store the number bits. Register SC is used as a counter during multiplication and 
division. Register C is used to control modification of multiplier bits during multi- 
plication. Registers ADOV and DVOV indicate overflow during addition and divi- 
sion, respectively. 

The configuration in Fig. 4.2 is now described by the following statements: 


Comment, configuration of the fixed-point arithmetic unit                    (4.1)
Register,    AC(S,R,Q,1-35),   $accumulator
             SR(S,1-35),       $storage register of the memory
             MQ(S,1-35),       $multiplier-quotient register
             SC(0-5),          $shift counter
             C,                $control bit during multiplication
             ADOV,             $addition overflow indicator
             DVOV,             $division overflow indicator
Subregister, AC(M)=AC(1-35),   $magnitude part of register AC
             SR(M)=SR(1-35),   $magnitude part of register SR
             MQ(M)=MQ(1-35),   $magnitude part of register MQ
Comment, description of the parallel adder                                   (4.2)
Terminal,    ADD(R,Q,1-35)=ADSR(R,Q,1-35)⊕ADAC(R,Q,1-35)⊕C(R,Q,1-35),
             C(R,Q,1-34)=ADSR(Q,1-35)*ADAC(Q,1-35)+ADAC(Q,1-35)*C(Q,1-35)
                         +C(Q,1-35)*ADSR(Q,1-35),
             C(35)=0,
Comment, declaration of terminals Z                                          (4.3)
Terminal,    Z(R,Q,1-35)=0-0-AC(M) add2 0-0-SR(M)-0
Comment, description of operator add2                                        (4.4)
Operator,    W(R,Q,1-35)=X(R,Q,1-35) add2 Y(R,Q,1-36)
Terminal,    C(35)=Y(36),
             C(R,Q,1-34)=X(Q,1-35)*Y(Q,1-35)+Y(Q,1-35)*C(Q,1-35)
                         +C(Q,1-35)*X(Q,1-35)
/begin/      W(R,Q,1-35)=X(R,Q,1-35)⊕Y(R,Q,1-35)⊕C(R,Q,1-35),
end of operator


In the above statements, register AC contains bit AC(Q) which is located between 
the sign bit and the most significant number bit of register AC. Bit AC(Q) stores 
the carry or borrow from the most significant number bit during addition or subtrac- 
tion. As will be shown, this bit will then be examined to determine if there is an over- 
flow condition after an addition, or an underflow condition after a subtraction. 
Register AC contains another bit called AC(R) which stores the carry from bit AC(Q) 
during multiplication. As will be shown, this extra bit of AC(R) is needed because 
a particular two-bit multiplication algorithm is employed. 

The above parallel adder, terminals Z, and operator add2 are further described 
below. 


4.1.3 Parallel Adder 


A parallel adder adds the respective bits of two binary numbers simultaneously.
It usually consists of single-bit full adders which have three inputs (an augend bit,
an addend bit, and an input-carry bit) and two outputs (a sum bit and an output-carry
bit). Let these inputs and outputs be named as shown in Fig. 4.3, where ADAC(i)
is the ith augend bit, ADSR(i) the ith addend bit, C(i) the ith carry bit, C(i-1) the
(i-1)th carry bit, and ADD(i) the ith sum bit, with index i denoting the ith stage of
the parallel adder.


Fig. 4.3 A single-bit full adder (ith stage of a parallel adder)


This full adder can be described by the following two Boolean functions:


ADD(i)=ADSR(i)⊕ADAC(i)⊕C(i)                                      (4.5)
C(i-1)=ADSR(i)*ADAC(i)+ADAC(i)*C(i)+C(i)*ADSR(i)                 (4.6)


Let the names of the inputs and outputs of the parallel adder for the fixed-point
arithmetic unit be those shown in Fig. 4.4. Inputs ADAC(R,Q,1-35) and ADSR(1-35)
are normally connected from the outputs of registers AC(R,Q,1-35) and SR(1-35),
respectively. Input ADSR(R,Q) is normally not connected from the SR register;
instead, constant 0's or 1's are inserted at these two inputs whenever necessary.
Input C(35), normally 0, is the input carry to the rightmost stage of the full adder.
Outputs ADD(R,Q,1-35) are the sum bits. Outputs C(R,Q,1-35) are the output
carries from the 37 stages of the full adders, but normally they are not regarded as
outputs.


Fig. 4.4 Names of the inputs and outputs of the parallel adder

The parallel adder for the arithmetic unit is formed by cascading 37 stages of
single-bit full adders. This parallel adder is shown in Fig. 4.5, where for simplicity
only carries and full adders (denoted by FA) are indicated. This parallel adder is
described by the terminal statement in (4.2).


Fig. 4.5 A parallel adder (stages R, Q, 1, ..., 35 with carries C(R), C(Q), ..., C(35))
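The cascade of full adders can be imitated in Python as a rough check of Boolean functions (4.5) and (4.6). The sketch below is an assumption-laden model rather than the hardware itself: the loop passes the carry from stage to stage one at a time, bit lists are written leftmost stage (R) first, and the helpers to_bits and from_bits are hypothetical conveniences.

```python
STAGES = 37  # stages R, Q, 1, ..., 35, held as a list from left (R) to right (35)


def full_adder(adsr, adac, c_in):
    """One stage of the adder: sum bit per (4.5), output carry per (4.6)."""
    add = adsr ^ adac ^ c_in
    c_out = (adsr & adac) | (adac & c_in) | (c_in & adsr)
    return add, c_out


def parallel_adder(adsr_bits, adac_bits, c35=0):
    """Cascade the stages; the carry leaving the leftmost stage is dropped."""
    assert len(adsr_bits) == len(adac_bits) == STAGES
    carry = c35
    add_bits = [0] * STAGES
    for i in range(STAGES - 1, -1, -1):       # rightmost stage (35) first
        add_bits[i], carry = full_adder(adsr_bits[i], adac_bits[i], carry)
    return add_bits


def to_bits(value):
    """Hypothetical helper: integer to a 37-element bit list, leftmost bit first."""
    return [(value >> (STAGES - 1 - k)) & 1 for k in range(STAGES)]


def from_bits(bits):
    """Hypothetical helper: bit list back to an integer."""
    result = 0
    for bit in bits:
        result = (result << 1) | bit
    return result
```

For example, from_bits(parallel_adder(to_bits(51), to_bits(23))) gives 74, and the same call with c35=1 gives 75.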


4.1.4 Terminals Z 


In the subsequent description of the division algorithm, a test is required for one of
the output terminals of the parallel adder. In this case, one set of inputs of the parallel 
adder is connected from subregister AC(M) and the other set of inputs from subregis- 
ter SR(M). The output terminals of the parallel adder now called Z are described by 
the terminal statement in (4.3). 


4.1.5 Operator add2 


For convenient use of the above-described parallel adder, an operator called 
add2 is now defined. Normally, the operator is employed to add the magnitudes of 
two 37-bit binary numbers; in this case, input-carry C(35) is 0. Occasionally, it is 
employed to add the 2's complement of a subtrahend to a minuend as a substitute for
subtraction; in this case, input-carry C(35) is 1. Therefore, input-carry is regarded 
as another input to the parallel adder, and operator add2 is created for this purpose. 
(If addition of the 2’s complement of a number is not required, basic operator add is 
adequate to serve the purpose.) Operator add2 is defined in the statements in (4.4). 

In using operator add2, both terminals W and X should have 37 bits while ter- 
minal Y should have 38 bits. Terminal Y(36) yields the value of input carry C(35). 
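As a hedged illustration of how operator add2 treats its operands, the following Python sketch models X as a 37-bit integer and Y as a 38-bit integer whose rightmost bit plays the role of Y(36), the input carry. The function name add2 mirrors the operator, but the integer packing is an assumption of this sketch.

```python
WIDTH = 37                     # bits R, Q, 1-35
MASK = (1 << WIDTH) - 1


def add2(x, y):
    """Add 37-bit x to the upper 37 bits of 38-bit y, with Y(36) as input carry."""
    carry_in = y & 1                   # Y(36) supplies input carry C(35)
    addend = (y >> 1) & MASK           # Y(R,Q,1-35)
    return (x + addend + carry_in) & MASK
```

With this model, plain addition appends a 0 to the addend, as in add2(5, 3 << 1), which yields 8. Subtraction by 2's complement appends a 1 to the 1's complement of the subtrahend: add2(9, ((~4 & MASK) << 1) | 1) yields 5.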


4.2 Arithmetic Algorithms 


As mentioned previously, the arithmetic unit is capable of binary addition, 
subtraction, multiplication, and division of fixed-point numbers in the signed magni- 
tude representation. The employed arithmetic algorithms are now described by 
sequence charts. 




4.2.1 Fixed-point Addition and Subtraction 


Fixed-point addition adds the fixed-point addend in the SR register to the 
fixed-point augend in the AC register, while fixed-point subtraction subtracts the 
fixed-point subtrahend in the SR register from the fixed-point minuend in the AC 
register. The MQ register is not needed. 

A fixed-point addition and subtraction algorithm is shown in the sequence charts 
in Figs. 4.6 and 4.7. The algorithm may be divided into three parts, namely, initializa- 
tion, add, and subtract. The initialization part is shown in Fig. 4.6. It first changes 
the sign of the subtrahend in the SR register in the case of subtraction so that the 
subtraction can become addition later. It then resets bits AC(R,Q) to zero, and com- 
pares the signs of the augend and the addend to determine whether they are the same. 
If they are, it proceeds to the add part; otherwise, it proceeds to the subtract part. 

The add part is also shown in Fig. 4.6. In this part, the addend magnitude in 
subregister SR(M) is added to the augend magnitude in subregister AC(Q,M), and 
the result is stored in subregister AC(Q,M). The sign in the AC register remains 
unchanged. To determine whether an overflow occurs, bit AC(Q) is tested. If it
contains 1, an overflow occurs and register ADOV is set to 1 to indicate the overflow.
If it contains 0, subregister AC(M) is further tested for zero. If the magnitude
is zero, bit AC(S) is reset to zero so that negative zero does not occur. The addition
process is now completed.

The subtract part is shown in Fig. 4.7. In this part, subtraction is performed by
addition of the 1's complement of the magnitude in subregister AC(M). As shown in
Fig. 4.7, the magnitude in subregister AC(M) is first 1's complemented, the magnitude
in subregister SR(M) is then added to it, and the result is stored in subregister
AC(R,Q,M). Bit AC(Q) is next tested. If it contains 1, it does not indicate an overflow
because there is no overflow for a subtraction. It indicates, however, that an
end-around carry occurs; this in turn indicates that the magnitude in subregister SR(M)
is larger than that in subregister AC(M). In this case, the end-around carry is added
to the accumulator, bit AC(Q) is reset to 0, and the sign bit in the accumulator is
complemented. If bit AC(Q) contains 0, there is no end-around carry; this indicates
that the magnitude in subregister SR(M) is either equal to or smaller than the
magnitude in subregister AC(M). In either case, subregister AC(M) is tested for all 1's.
If it contains all 1's, the magnitude in subregister SR(M) is equal to that in
subregister AC(M), and the result should be positive zero; therefore, register AC is
reset to 0. If it does not contain all 1's, the magnitude in subregister SR(M) is smaller
than that in subregister AC(M), and the result in subregister AC(M) is in the 1's
complement form. The result is then restored to the magnitude form by being 1's
complemented. The addition process is now completed.

There are two versions of description by the CDL, the nonprocedural and the 
procedural (10). The procedural description of the above addition and subtraction 
can be readily obtained from the sequence chart in Figs. 4.6 and 4.7 and is shown 
below. 




Fig. 4.6 Sequence chart for the initialization part and the add
part of fixed-point addition and subtraction for binary
numbers in the signed magnitude representation


Comment, addition and subtraction sequence                                   (4.7)
      IF ((OPCODE=0)+(OPCODE=1)) THEN (GOTO A0) ELSE (GOTO A2);
/A0/  IF (OPCODE=1) THEN (SR(S)←SR(S)');
      AC(R,Q)←0, IF (SR(S)≠AC(S)) THEN (GOTO A);
      AC(R,Q,M)←0-AC(Q,M) add2 0-0-SR(M)-0;
      IF (AC(Q)=1) THEN (ADOV←1, GOTO A2);
      IF (AC(M)=0) THEN (AC(S)←0), GOTO A2;
/A/   AC(M)←AC(M)';
      AC(R,Q,M)←0-AC(Q,M) add2 0-0-SR(M)-0;
      IF (AC(Q)≠1) THEN (GOTO A1);
      AC(R,Q,M)←1 add2 0-AC(Q,M)-0;
      AC(Q)←0, AC(S)←AC(S)', GOTO A2;
/A1/  IF (AC(M)=37...7) THEN (AC←0) ELSE (AC(M)←AC(M)');
/A2/  END


Fig. 4.7 Sequence chart for the subtract part of fixed-point
addition and subtraction for binary numbers in the
signed magnitude representation


As will be shown later, the above OPCODE is a two-bit register storing the op-code
of an instruction. When the op-code is 0, 1, 2, or 3, it denotes addition, subtraction,
multiplication, or division, respectively.

In the above procedural description, neither timing nor control signals are
included. Each execution statement, which allows one or more micro-statements, is
terminated by a semicolon. The order of these statements is the order of the operation
of the sequence. Symbolic labels A0, A, A1, and A2 are arbitrarily chosen for reference
purposes, and a GOTO statement is used to refer to the labels for changing the order
of the sequential operation.
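The add and subtract parts can be traced in Python as a rough model of the algorithm. The sketch below is not a transcription of the CDL statements: the function name add_sub and the parameter bits (the number of magnitude bits, 35 in this chapter) are assumptions, and ordinary integers replace the registers.

```python
def add_sub(ac_sign, ac_mag, sr_sign, sr_mag, subtract=False, bits=35):
    """Return (AC sign, AC magnitude, ADOV) after AC + SR or AC - SR."""
    mask = (1 << bits) - 1
    adov = 0
    if subtract:
        sr_sign ^= 1                         # SR(S) <- SR(S)'
    if sr_sign == ac_sign:                   # add part
        total = ac_mag + sr_mag
        ac_q, ac_mag = total >> bits, total & mask
        if ac_q == 1:
            adov = 1                         # carry into AC(Q): overflow
        elif ac_mag == 0:
            ac_sign = 0                      # avoid negative zero
    else:                                    # subtract part
        total = (ac_mag ^ mask) + sr_mag     # 1's complement of AC(M), plus SR(M)
        ac_q, ac_mag = total >> bits, total & mask
        if ac_q == 1:                        # end-around carry: SR(M) > AC(M)
            ac_mag = (ac_mag + 1) & mask
            ac_sign ^= 1
        elif ac_mag == mask:                 # all 1's: magnitudes were equal
            ac_sign, ac_mag = 0, 0
        else:                                # SR(M) < AC(M): restore magnitude
            ac_mag ^= mask
    return ac_sign, ac_mag, adov
```

For instance, add_sub(0, 23, 0, 51, subtract=True) follows the end-around-carry branch and returns (1, 28, 0), that is, 23 - 51 = -28.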


4.2.2 Fixed-point Multiplication 


Fixed-point multiplication multiplies a fixed-point multiplier in the MQ register
by a fixed-point multiplicand in the SR register, and produces the product in the
casregister formed by cascading the AC and MQ registers. The most significant part of
the product is stored in the AC register and the least significant part in the MQ
register. The multiplicand remains in the SR register, while the multiplier in the MQ
register is lost.

Figures 4.8 and 4.9 show the sequence charts for fixed-point multiplication of 
binary numbers in the signed magnitude representation. The multiplication algorithm 
may be divided into two parts: the initialization part in Fig. 4.8 and the multiplication 
part in Fig. 4.9. During the initialization part, the signs of the multiplicand and the 
multiplier are compared. If they are equal, bit AC(S) is reset to zero; otherwise, it is 
set to 1. The magnitude part of the AC register is then set to zero. It is then followed 
by handling zero multiplicand and zero multiplier. If the multiplicand in subregister 
SR(M) is zero, then the multiplier in subregister MQ(M) is set to zero. Whether the 
multiplicand is zero or not, subregister MQ(M) is tested next for zero. If it is zero, 
the product in registers AC and MQ should be zero; therefore, only bits AC(S) and
MQ(S) need to be reset to zero and the process of multiplication is terminated.
Otherwise, it proceeds to the multiplication part.

The multiplication part is shown in the sequence chart in Fig. 4.9. The
multiplication makes use of the repeated addition method by having two multiplier bits
examined at one time. For a multiplier with a magnitude of 35 bits, only 18 addition
cycles are required. As shown in Fig. 4.9, initially, shift counter SC is set to 18 and
control register C is reset to zero. Furthermore, bit MQ(S) is reset to 0 and is regarded
as a multiplier magnitude bit so that the magnitude of the multiplier becomes 36-bit.
In this way, the shift micro-operations required at the end of the multiplication
sequence can be simplified. The multiplication then begins by examining bits MQ(34, 35)
which can be 00, 01, 10, or 11. If they are 00, there is no addition of the multiplicand.
If they are 01, the multiplicand is added to the accumulator. If they are 10, the
multiplicand is shifted 1 bit position to the left and then added to the accumulator. If
they are 11, adding the multiplicand three times is required. Instead of adding the
multiplicand three times, it can be added four times and then subtracted once. To add
the multiplicand four times is equivalent to incrementing the next two more significant
multiplier bits by one during the next addition cycle; this is remembered by setting

Fig. 4.8 Sequence chart for the initialization part of fixed-point
multiplication for binary numbers in the signed magnitude
representation


C to 1. To subtract the multiplicand once is accomplished by adding the 2's complement
of the multiplicand; this gives a partial product which can be negative. The addition
of the 2's complement of the multiplicand is accomplished by adding the 1's complement
of the multiplicand and, at the same time, by making input carry C(35) equal to 1.

After the multiplicand is added, not added, or subtracted, casregister AC(R,Q,M)-
MQ(M) is shifted two bit positions to the right and shift counter SC is decremented
by 1. This right shift must follow the rule of arithmetic shift because the partial
product can be a magnitude (indicated by AC(R) being a 0) or a 2's complement (indicated
by AC(R) being a 1). The shift counter is next tested for 0. If it does not contain 0,
register C is tested next for 1. If register C contains 1, this means that multiplier bits
MQ(34, 35) require a correction. This correction changes bits MQ(34, 35) from 00,
01, 10, and 11 to 01, 10, 11, and 00, respectively, and, at the same time, if the change
is from 00 to 01 or from 01 to 10, register C is reset to 0. Bits MQ(34, 35) are again
tested. An addition, no addition, or a subtraction of the multiplicand is accordingly
again performed; the casregister is again shifted two bit positions to the right; shift
counter SC is again decremented by one and then tested for 0. If it is not 0, the steps


Fig. 4.9 Sequence chart for the multiplication part of fixed-
point multiplication for binary numbers in the signed
magnitude representation


of correcting multiplier bits, etc., are repeated until shift counter SC contains a 0.
At this time, subregister AC(M) is shifted one bit position to the left and the product
sign is inserted into bit MQ(S). The multiplication is now completed. Note that bits
MQ(34, 35) during the eighteenth time can be 00 or 01 only because MQ(S) has
initially been reset to 0.


To illustrate the sequential operations of the multiplication process as described
by the sequence charts in Figs. 4.8 and 4.9, an example showing step-by-step changes
in the registers is shown in Fig. 4.10. In this example, we have

multiplicand = -110011 = -51
multiplier = +010111 = +23
product = -010010010101 = -1173

 1. set AC(S) to 1.
 2. set AC(R, Q, M) to 0.
    zero fraction test. no zero fraction. multiplication proceeds.
 3. reset C to 0. set SC to 11 (binary).
 4. MQ(34, 35)=11: set C to 1, add 2's complement of SR(M).
 5. shr 2 bits.
 6. C=1: set MQ(34, 35) to 10. reset C to 0.
 7. MQ(34, 35)=10: shift SR(M) 1 bit left and then add SR(M).
 8. shr 2 bits.
 9. MQ(34, 35)=01: add SR(M).
10. shr 2 bits.
11. SC=0: MQ(S)←AC(S). multiplication complete.

Fig. 4.10 Fixed-point multiplication of binary numbers in the
signed magnitude representation

In Fig. 4.10, notice the right shift which follows the rule for arithmetic shift of binary
numbers in the 2's complement representation.

Instead of sequence charts, the multiplication process can also be described in a
procedural manner. A procedural description is shown below:
Comment, multiplication sequence                                             (4.8)
      IF (SR(S)=MQ(S)) THEN (AC(S)←0) ELSE (AC(S)←1);
      AC(R,Q,M)←0;
      IF (SR(M)=0) THEN (GOTO B0);
      IF (MQ(M)≠0) THEN (GOTO B);
/B0/  MQ←0, AC(S)←0, GOTO B2;
/B/   SC←18, C←0, MQ(S)←0;
/B1/  IF (C=1) THEN (IF (MQ(34,35)=00) THEN (MQ(34,35)←01, C←0),
                     IF (MQ(34,35)=01) THEN (MQ(34,35)←10, C←0),
                     IF (MQ(34,35)=10) THEN (MQ(34,35)←11),
                     IF (MQ(34,35)=11) THEN (MQ(34,35)←00));
      IF (MQ(34,35)=01) THEN (AC(R,Q,M)←AC(R,Q,M) add2
                              0-0-SR(M)-0),
      IF (MQ(34,35)=10) THEN (AC(R,Q,M)←AC(R,Q,M) add2
                              0-SR(M)-0-0),
      IF (MQ(34,35)=11) THEN (C←1, AC(R,Q,M)←0-0-AC(M) add2
                              1-1-SR(M)'-1);
      AC(2-35)-MQ←AC(Q,M)-MQ(S,1-33), AC(Q,1)←AC(R)-AC(R);
      SC←countdn SC;
      IF (SC≠0) THEN (GOTO B1);
      AC(M)←AC(2-35)-MQ(S), MQ(S)←AC(S);
/B2/  END
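The rule of examining two multiplier bits per cycle, with control bit C remembering a pending "add four times", can be checked arithmetically with a short Python sketch. It is a simplified model under stated assumptions: instead of shifting the casregister two positions to the right each cycle, the multiplicand is weighted by a power of 4, a signed Python integer replaces AC(R,Q,M)-MQ, and the name multiply and the parameter bits are hypothetical.

```python
def multiply(sr_sign, sr_mag, mq_sign, mq_mag, bits=35):
    """Return (sign, magnitude) of SR times MQ, scanning MQ two bits per cycle."""
    if sr_mag == 0 or mq_mag == 0:
        return 0, 0                         # a zero product is made positive
    sign = 0 if sr_sign == mq_sign else 1
    cycles = (bits + 2) // 2                # 18 when bits = 35 (MQ(S) joins as a 0)
    acc = 0                                 # partial product; may go negative
    c = 0                                   # control bit C
    for k in range(cycles):
        pair = (mq_mag >> (2 * k)) & 0b11   # the pair that would sit in MQ(34,35)
        if c == 1:                          # correction left over from last cycle
            if pair in (0b00, 0b01):
                c = 0
            pair = (pair + 1) & 0b11
        weight = 4 ** k                     # stands in for the 2-bit right shifts
        if pair == 0b01:
            acc += sr_mag * weight
        elif pair == 0b10:
            acc += 2 * sr_mag * weight
        elif pair == 0b11:
            c = 1                           # add 4x next cycle, subtract 1x now
            acc -= sr_mag * weight
    assert c == 0                           # the leading 0 bit absorbs the last carry
    return sign, acc
```

Applied to the example of Fig. 4.10, multiply(1, 51, 0, 23, bits=6) returns (1, 1173), the product -1173.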


4.2.3 Fixed-point Division 


Fixed-point division divides a fixed-point dividend in the casregister formed
by the AC and MQ registers by a fixed-point divisor in the SR register, and produces
a fixed-point quotient in the MQ register and a remainder in the AC register. The
divisor remains in the SR register, while the dividend in the casregister is lost. The
division employs a comparison method.

Figures 4.11 and 4.12 show the sequence charts for fixed-point division of binary 
numbers in the signed magnitude representation. The charts may be divided into 
two parts: the initialization part in Fig. 4.11, and the division part in Fig. 4.12. In 


Fig. 4.11 Sequence chart for the initialization part of fixed-
point division for binary numbers in the signed magnitude
representation


the initialization part, division overflow is tested, and the sign of the quotient is
determined. The division overflow is determined by a subtraction test which subtracts
the dividend in subregister AC(M) from the divisor in subregister SR(M). The
subtraction test is actually performed by addition of the 1's complement of the dividend
in subregister AC(M). Recall that terminal statement (4.3) has been declared to define
the sum terminals of the parallel adder with 0-0-AC(M) and 0-0-SR(M) and C(35)
being 0 as the inputs. That terminal statement makes it possible to refer to sum
terminal Z(Q) in the division overflow test. If the subtraction test shows that sum
terminal Z(Q) is 0, this indicates that the dividend in subregister AC(M) is larger than or
equal to the divisor in subregister SR(M). Therefore, a division overflow occurs and
register DVOV is set to 1 to indicate the overflow; the division process is terminated.
If the subtraction test shows that sum terminal Z(Q) is 1, it proceeds to determine
the sign of the quotient. The sign of the quotient is 0 if the signs in bits SR(S) and
AC(S) are the same; otherwise, it is 1. It then proceeds to the division of the dividend
by the divisor.


Fig. 4.12 Sequence chart for the divide part of fixed-point
division for binary numbers in the signed magnitude
representation

In the division part, the partial remainder in subregister AC(M) remains in the
1's complement form. The division begins by setting shift counter SC to octal 43 (or
decimal 35). Casregister AC(M)-MQ(M) is next shifted one bit position to the left;
during the shift, MQ(1) is complemented and then shifted into bit AC(35) because
the partial remainder in subregister AC(M) is being kept in the 1's complement
form. The divisor in subregister SR(M) is compared with the partial remainder in
AC(M). If this comparison shows that terminal Z(Q) is 0, then the partial remainder
in subregister AC(M) is larger than or equal to the divisor in subregister SR(M).
In this case, bit MQ(35) is set to 1, and at the same time the divisor in subregister
SR(M) is added to the partial remainder in subregister AC(M). If terminal Z(Q) is
1, it indicates that the divisor in subregister SR(M) is greater than the partial remainder
in subregister AC(M) and no operation is performed. Shift counter SC is next
decremented by 1 and then tested for 0. If counter SC does not contain 0, these steps of
left shifting, comparing, adding, inserting one, decrementing, and testing counter
SC are repeated until counter SC contains 0. At this time, the partial remainder in
subregister AC(M) is restored to the magnitude form by being 1's complemented
once more. The division process is now completed.

To illustrate the sequential operations of the division process as described by 
the sequence charts in Figs. 4.11 and 4.12, an example showing step-by-step changes 
in the registers is shown in Fig. 4.13. In this example, we have 


dividend = +00001111 = +15 (decimal)
divisor = -0011 = -3 (decimal)
quotient = -0101 = -5 (decimal)
remainder = +0000 = 0
In Fig. 4.13, notice that the division overflow test is first made before the division
actually begins.


Instead of using sequence charts, a procedural description of the division
process is shown below.


Comment, division sequence                                                   (4.9)
/C0/  AC(R,Q,M)←0-0-AC(M)';
      IF (Z(Q)=0) THEN (DVOV←1, GOTO C2);
      IF (SR(S)=AC(S)) THEN (MQ(S)←0) ELSE (MQ(S)←1);
/C/   SC←35;
/C1/  AC(M)-MQ(M)←AC(2-35)-MQ(1)'-MQ(2-35)-0;
      IF (Z(Q)=0) THEN (MQ(35)←1, AC(R,Q,M)←0-0-AC(M) add2
                        0-0-SR(M)-0);
      SC←countdn SC;
      IF (SC≠0) THEN (GOTO C1);
      AC(M)←AC(M)';
/C2/  END




 1. 1's complement of AC(M).
    subtraction test, Z(Q)=1; division process proceeds.
 2. set MQ(S) to 1.
    division begins.
 3. shift left.
    subtraction test, Z(Q)=1; AC(M)<SR(M).
 4. no operation, q1=0.
 5. decrement SC.
 6. shift left.
    subtraction test, Z(Q)=0; AC(M)≥SR(M).
 7. add, q2=1.
 8. decrement SC.
 9. shift left.
    subtraction test, Z(Q)=1; AC(M)<SR(M).
10. no operation, q3=0.
11. decrement SC.
12. shift left.
    subtraction test, Z(Q)=0; AC(M)≥SR(M).
13. add, q4=1.
14. decrement SC.
15. 1's complement of AC(M).
    division complete.

Fig. 4.13 Fixed-point division of binary numbers in the signed
magnitude representation
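The comparison method, with the partial remainder kept in 1's complement form, can be followed in a small Python model. This is a sketch under assumptions rather than the CDL description: the function name divide, the parameter bits, and the returned tuple are hypothetical, and the quantity z_q plays the role of terminal Z(Q), the carry out of the magnitude stages when AC(M) and SR(M) are added.

```python
def divide(ac_sign, ac_mag, mq_mag, sr_sign, sr_mag, bits=35):
    """Return (dvov, quotient_sign, quotient_mag, remainder_mag)."""
    mask = (1 << bits) - 1

    def z_q(ac):
        # Terminal Z(Q): carry out of AC(M) + SR(M) with input carry 0.
        return (ac + sr_mag) >> bits

    ac = ac_mag ^ mask                       # AC(M) <- AC(M)'
    if z_q(ac) == 0:                         # AC(M) >= SR(M): division overflow
        return 1, None, None, None
    q_sign = 0 if sr_sign == ac_sign else 1
    mq = mq_mag
    for _ in range(bits):                    # SC counts down from bits to 0
        mq_top = (mq >> (bits - 1)) & 1      # MQ(1)
        ac = ((ac << 1) & mask) | (mq_top ^ 1)   # MQ(1)' enters AC(35)
        mq = (mq << 1) & mask                # 0 enters MQ(35)
        if z_q(ac) == 0:                     # partial remainder >= divisor
            mq |= 1                          # MQ(35) <- 1
            ac = (ac + sr_mag) & mask        # subtract, in 1's complement form
    return 0, q_sign, mq, ac ^ mask          # restore remainder to magnitude form
```

With the 4-bit example of Fig. 4.13, divide(0, 0b0000, 0b1111, 1, 0b0011, bits=4) returns (0, 1, 5, 0): no overflow, a negative quotient of magnitude 5, and a zero remainder.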


4.3 CDL Descriptions 


The algorithms for addition, subtraction, multiplication, and division as shown 
by sequence charts are now described in the CDL statements. The timing and control 
signals are first described, followed by the addition and subtraction sequence, the 
multiplication sequence, and the division sequence. 




4.3.1 Timing and Control Signals 


The configuration for sequencing the addition, subtraction, multiplication, and 
division processes consists of a 4-bit timing register D, a 4-bit OPCODE register, 
their associated decoders, a 4-bit wait counter WC, and a memory M with address 
register AD. The memory has 32,768 36-bit words. Furthermore, there are six single- 
bit registers, I, READ, SI, N, W, and Y. Register I indicates a fetch cycle when it 
is 0, and an execution cycle when it is 1. Register READ commands memory READ 
operation when it is set to 1. Registers SI, N, W, and Y are for subsequencing, as
will be shown later. In addition, there are clock P and switch START. 

The configuration for sequencing the arithmetic unit is shown in Fig. 4.14 and 
is described by the CDL statements below: 


Comment, control configuration of the fixed-point arithmetic unit            (4.10)
Register,   D(0-3),        $timing register
            AD(0-14),      $address register of the memory
            OPCODE(0-3),   $op-code register
            I,             $when 0, fetch cycle; when 1, execution cycle
            READ,          $command memory read
            SI,            $sign indicator
            WC(0-3),       $wait counter
            N,             $control register
            W,             $control register
            Y,             $control register
Memory,     M(AD)=M(0-32767,0-35),
Decoder,    K(0-11)=D,
            J(0-11)=OPCODE,
Switch,     START(ON),
Clock,      P(1-2),
Terminal,   ADS=(J(0)+J(1))*I,
            MPY=J(2)*I,
            DIV=J(3)*I,
/START(ON)/ D←0,
/P(2)/      D←countup D,


The above configuration generates a sequence of control signals for sequencing
12 steps by counter D, its associated decoder, and the two-phase clock P. These 12
control signals are represented by labels K(0)*P(1), K(1)*P(1), ..., K(11)*P(1). The
completion of these 12 steps will be referred to as an execution cycle.


Fig. 4.14 Control configuration of the arithmetic unit


The sequencing begins when switch START is turned to the ON position; the
START switch resets register D to 0 and in turn generates the control signal for the
first step, K(0). Each of these 12 control signals is generated at the first phase of
the clock, P(1), while counter D is incremented by 1 at the second phase P(2); this is the
manner in which the sequence of control signals is advanced. When counter D reaches
11, register D will be set to 15 by clock P(1) during the last of the 12 steps if the
sequence is to be terminated. Setting register D to 15 causes generation of another
12-step sequence of control signals. Except for the micro-operation that sets D to 15,
which is to be specified at the end of each sequence, the micro-operations required for
counter D are described by the above two execution statements. The commands for
the addition and subtraction sequence (ADS), for the multiplication sequence (MPY),
and for the division sequence (DIV) are generated by register OPCODE, its associated
decoder, and register I.
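The way counter D and the two-phase clock advance the control signals can be mimicked by a short Python sketch. It rests on an assumption not stated explicitly above, namely that countup on the 4-bit register D wraps from 15 back to 0; the function name control_steps and its parameters are hypothetical.

```python
def control_steps(clock_periods, last_step=11):
    """List the index k of the K(k)*P(1) signal issued in each clock period."""
    d = 0                            # /START(ON)/ D <- 0
    issued = []
    for _ in range(clock_periods):
        k = d                        # decoder output K(d) is gated by P(1)
        issued.append(k)
        if k == last_step:
            d = 15                   # D <- 15, specified at the end of a sequence
        d = (d + 1) & 0xF            # /P(2)/ D <- countup D (wraparound assumed)
    return issued
```

Under that assumption, control_steps(14) lists steps 0 through 11 followed by 0 and 1, showing one 12-step cycle running directly into the next.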

In the subsequent descriptions of arithmetic processes, the time for the 12-step 
operation of an execution cycle is chosen to be equal to the memory cycle time so 
that there is only one cycle time to deal with. Furthermore, each execution cycle 
coincides with each memory cycle. This choice keeps the memory in operation one 
memory cycle after another, just as the arithmetic unit is kept in operation one 
execution cycle after another. 

It is assumed that, when the memory is commanded to read during the first 
step, the word from the memory becomes available in register SR at the end of the 
fourth step. 


4.3.2 Addition and Subtraction Sequence 


The fixed-point addition and subtraction process has been described in the
sequence charts in Figs. 4.6 and 4.7. The description of the process by the CDL
statements is obtained when a control signal is assigned to each step of the addition and
subtraction process. The control signals for the first six steps are ADS*K(0)*P(1),
..., ADS*K(5)*P(1). At this point, the initialization part is completed and the
process is branched into an add part and a subtract part as shown in Figs. 4.6 and
4.7. The sequencing of the remaining six steps of the process is likewise branched
into two 6-step subsequences by means of register SI. The control signals are
SI'*ADS*K(6)*P(1), ..., SI'*ADS*K(11)*P(1) for the add part and SI*ADS*K(6)*P(1),
..., SI*ADS*K(11)*P(1) for the subtract part. With these control signals, the
addition and subtraction process described by the CDL statements is:


Comment, addition and subtraction sequence                                   (4.11)
/ADS*K(0)*P(1)/       READ<-1,
/ADS*K(1)*P(1)/       N<-0, W<-0, Y<-0, WC<-0,
/ADS*K(2)*P(1)/       AC(R,Q)<-0,
/ADS*K(3)*P(1)/       SR<-M(AD),
/ADS*K(4)*P(1)/       IF (OPCODE=1) THEN (SR(S)<-SR(S)'),
/ADS*K(5)*P(1)/       IF (SR(S)=AC(S)) THEN (SI<-0) ELSE (SI<-1),
Comment, add part
/SI'*ADS*K(7)*P(1)/   AC(R,Q,M)<-0-AC(Q,M) add2 0-0-SR(M)-0,
/SI'*ADS*K(8)*P(1)/   IF (AC(Q,M)=0) THEN (AC(S)<-0),
/SI'*ADS*K(9)*P(1)/   IF (AC(Q)=1) THEN (ADOV<-1),
/ADS*K(11)*P(1)/      I<-0, D<-15,
Comment, subtract part
/SI*ADS*K(6)*P(1)/    AC(M)<-AC(M)',
/SI*ADS*K(7)*P(1)/    AC(R,Q,M)<-0-AC(Q,M) add2 0-0-SR(M)-0,
/SI*ADS*K(8)*P(1)/    IF (AC(Q)=0) THEN (IF (AC(M)=37...7)
                      THEN (AC<-0) ELSE (AC(M)<-AC(M)')),
/SI*ADS*K(9)*P(1)/    IF (AC(Q)=1) THEN (AC(R,Q,M)<-1 add2
                      0-AC(Q,M)-0, AC(S)<-AC(S)'),
/SI*ADS*K(10)*P(1)/   AC(Q)<-0,
END
In the above description, the memory read operation is commanded during the
first step, and the word becomes available from the memory during the fourth step
as assumed previously. During the last step, register D is set to 15 so that another
cycle will begin, and this cycle will coincide with a memory cycle. Furthermore,
register I is reset to 0 so that the next cycle is a fetch cycle to receive the next instruction
from the memory.
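The arithmetic carried out by the add part and the subtract part can be sketched at the integer level in Python. This is an illustrative model and not a register-level transcription of (4.11): the names add_sub and value are hypothetical, magnitudes are assumed to be 35 bits, and the subtract part is modeled as complementing the accumulator magnitude, adding, and then either recomplementing or adding the carry back in.

```python
BITS = 35
MASK = (1 << BITS) - 1          # magnitude of all ones


def value(sign, mag):
    """Integer value of a sign-magnitude number (sign 1 means negative)."""
    return -mag if sign else mag


def add_sub(ac_sign, ac_mag, sr_sign, sr_mag, subtract=False):
    """Return (sign, magnitude, overflow) of AC + SR or AC - SR."""
    if subtract:
        sr_sign ^= 1                          # change the operand sign for subtraction
    if sr_sign == ac_sign:                    # like signs: add part
        total = ac_mag + sr_mag
        overflow = total > MASK               # a carry out of the magnitude
        sign = 0 if total == 0 else ac_sign   # a zero result is made positive
        return sign, total & MASK, overflow
    # Unlike signs: subtract part, starting from the complemented magnitude.
    total = (ac_mag ^ MASK) + sr_mag
    carry, mag = total >> BITS, total & MASK
    if carry == 0:
        if mag == MASK:                       # all ones means the difference is zero
            return 0, 0, False
        return ac_sign, mag ^ MASK, False     # recomplement, sign unchanged
    return ac_sign ^ 1, (mag + 1) & MASK, False   # add the carry back, change sign


print(value(*add_sub(0, 5, 0, 3, subtract=True)[:2]))
```

The last line evaluates 5 - 3 through the subtract part; swapping the two magnitudes exercises the branch in which the sign of the accumulator is changed.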


4.3.3 Multiplication Sequence 


The fixed-point multiplication process is described in the sequence charts in
Figs. 4.8 and 4.9. The control signals for sequencing this process are MPY*K(0)*P(1),
..., and MPY*K(11)*P(1). The memory word is again assumed to become available
at the end of the fourth step. With these control signals, the multiplication process
described by the CDL statements is as follows:


Comment, multiplication sequence                                             (4.12)
/MPY*K(0)*P(1)/      READ<-1,
/MPY*K(1)*P(1)/      N<-0, W<-0, Y<-0, WC<-0,
/MPY*K(2)*P(1)/      AC(R,Q,M)<-0,
/MPY*K(3)*P(1)/      SR<-M(AD),
/MPY*K(4)*P(1)/      IF (SR(S)=MQ(S)) THEN (AC(S)<-0)
                     ELSE (AC(S)<-1),
/MPY*K(5)*P(1)/      IF ((SR(M)=0)+(MQ(M)=0)) THEN (MQ<-0, AC(S)<-0, Y<-1)
                     ELSE (W<-1),
/MPY*K(11)*P(1)/     IF (Y=1) THEN (Y<-0, I<-0, D<-15),
                     IF (W=1) THEN (N<-1, W<-0, D<-5),
Comment, multiplication part
/N*MPY*K(6)*P(1)/    SC<-18, C<-0, MQ(S)<-0,
/N*MPY*K(7)*P(1)/    IF ((C=1)*(MQ(34,35)=00)) THEN (M(34,35)<-01, C<-0),
                     IF ((C=1)*(MQ(34,35)=01)) THEN (M(34,35)<-10, C<-0),
                     IF ((C=1)*(MQ(34,35)=10)) THEN (M(34,35)<-11),
                     IF ((C=1)*(MQ(34,35)=11)) THEN (M(34,35)<-00),
/N*MPY*K(8)*P(1)/    IF (MQ(34,35)=01) THEN (AC(R,Q,M)
                     <-AC(R,Q,M) add2 0-0-SR(M)-0),
                     IF (MQ(34,35)=10) THEN (AC(R,Q,M)
                     <-AC(R,Q,M) add2 0-SR(M)-0-0),
                     IF (MQ(34,35)=11) THEN (AC(R,Q,M)
                     <-0-0-AC(M) add2 1-1-SR(M)'-1, C<-1),
/N*MPY*K(9)*P(1)/    AC(2-35)-MQ<-AC(Q,M)-MQ(S,1-33),
                     AC(Q,1)<-AC(R)-AC(R), SC<-countdn SC,
/N*MPY*K(10)*P(1)/   IF (SC=0) THEN (MQ(S)<-AC(S), AC(M)<-AC(2-35)
                     -MQ(S)) ELSE (D<-6), WC<-4,
/N*MPY*K(11)*P(1)/   IF (WC=0) THEN (N<-0, I<-0, D<-15)
                     ELSE (D<-10, WC<-countdn WC),
END


The above description consists of two parts, the initialization described in the
sequence chart of Fig. 4.8, and the multiplication described in the sequence chart of
Fig. 4.9. If the multiplicand or the multiplier or both are zero, the product is zero;
register Y is set to 1, and the fixed-point multiplication process is terminated at the last
step of the first execution cycle. If neither the multiplicand nor the multiplier is zero,
register N is set to 1 to initiate the multiplication subsequence.

In the multiplication part, there is a loop which consists of four steps. The
multiplication part is sequenced by a series of six control signals, N*MPY*K(6)*P(1),
..., and N*MPY*K(11)*P(1). During each iteration, shift counter SC is tested for
zero. If it is not zero, counter D is set to 6 so that the series of control signals for the
loop is generated once more. There are 18 iterations for the multiplication
loop. When the shift counter becomes zero, this particular step falls on the seventh
step of an execution cycle. At this step, counter WC is set to 4. As described in the
last statement of the above description (4.12), counter WC is next decremented in
this one-statement loop until it reaches 0. At this time, the sequence is precisely at
the last step of that execution cycle. Registers N and I are then both reset to 0 while
register D is set to 15 to start another cycle.
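The arithmetic idea behind the four-step loop, examining the multiplier two bits at a time with a carry between bit pairs, can be sketched in Python. This is an integer-level illustration rather than a model of the registers: the function name is hypothetical, the variable carry plays the role of register C, and the sketch shifts the multiplicand left instead of shifting the partial product right.

```python
def multiply_two_bits_per_step(multiplicand, multiplier, iterations=18):
    """Multiply non-negative integers by scanning the multiplier in bit pairs."""
    product = 0
    carry = 0                                   # plays the role of register C
    for step in range(iterations):
        weight = 2 * step                       # binary weight of this bit pair
        digit = ((multiplier >> weight) & 0b11) + carry
        carry = 0
        if digit == 1:
            product += multiplicand << weight           # add the multiplicand
        elif digit == 2:
            product += multiplicand << (weight + 1)     # add twice the multiplicand
        elif digit == 3:
            product -= multiplicand << weight           # subtract and carry into
            carry = 1                                   # the next bit pair
        elif digit == 4:
            carry = 1                                   # nothing to add, carry only
    if carry:
        raise ValueError("multiplier too long for the number of iterations")
    return product


print(multiply_two_bits_per_step(13, 11) == 13 * 11)
```

With 18 iterations the sketch covers 36 multiplier bits. Because a 35-bit magnitude leaves the top bit of the last pair at 0, no carry remains after the last iteration.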


4.3.4 Division Sequence 


The fixed-point division process is described in the sequence charts in Figs.
4.11 and 4.12. The control signals for sequencing the initialization part are
DIV*K(0)*P(1), ..., DIV*K(4)*P(1) and DIV*K(11)*P(1). Those for sequencing the
division part are N*DIV*K(8)*P(1), ..., and N*DIV*K(11)*P(1). The memory word
is again available at the end of the fourth step of the first execution cycle. With these
control signals, the fixed-point division described by the CDL statements is as follows:


Comment, division sequence                                                   (4.13)
/DIV*K(0)*P(1)/      READ<-1,
/DIV*K(1)*P(1)/      N<-0, W<-0, Y<-0, WC<-0,
/DIV*K(2)*P(1)/      AC(R,Q,M)<-0-0-AC(M)',
/DIV*K(3)*P(1)/      SR<-M(AD),
/DIV*K(4)*P(1)/      IF (Z(Q)=0) THEN (DVOV<-1, W<-1)
                     ELSE (SC<-35, N<-1),
                     IF (SR(S)=AC(S)) THEN (MQ(S)<-0)
                     ELSE (MQ(S)<-1),
/DIV*K(11)*P(1)/     IF (W=1) THEN (W<-0, I<-0, D<-15),
Comment, division part
/N*DIV*K(8)*P(1)/    AC(M)-MQ(M)<-AC(2-35)-MQ(1)-MQ(2-35)-0,
                     SC<-countdn SC,
/N*DIV*K(9)*P(1)/    IF (Z(Q)=1) THEN (AC(R,Q,M)<-0-0-AC(M) add2
                     0-0-SR(M)-0, MQ(35)<-1),
/N*DIV*K(10)*P(1)/   IF (SC≠0) THEN (D<-7)
                     ELSE (AC(M)<-AC(M)', WC<-6),
/N*DIV*K(11)*P(1)/   IF (WC=0) THEN (N<-0, I<-0, D<-15)
                     ELSE (D<-10, WC<-countdn WC),
END


In the above description, if division overflow occurs in the initialization part,
register W is set to 1, and the fixed-point division is terminated at the end of the
first execution cycle. Otherwise, register N is set to 1 to initiate the division
subsequence. In the division subsequence there is a three-step division loop. During each
iteration of the loop, shift counter SC is tested for zero. If it is not zero, counter D is
set to 7 so that the sequence of the control signals of the subsequence is repeated.
There are 35 iterations for the division loop. When the shift counter becomes zero,
this particular step falls on the fifth step of an execution cycle, and counter WC is
set to 6. Counter WC is next decremented in the one-statement loop (the last statement
in (4.13)) until it reaches 0. At this time, the sequence is at the last step of that
execution cycle and is terminated in the same manner as the multiplication sequence.
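The step counts quoted above can be checked with a small Python sketch that follows only the sequencing registers D, N, SC, and WC of the division sequence. It is a sketch under stated assumptions: the function name division_timing is hypothetical, no division overflow is assumed, and D is assumed to be incremented modulo 16 at P(2).

```python
def division_timing(shift_count=35, wait_count=6):
    """Return (total steps, step at which SC is found to be zero)."""
    d, n, sc, wc = 0, 0, 0, 0
    step = 0
    loop_exit_step = None
    while True:
        step += 1
        # Phase P(1): only the statements that affect sequencing are modeled.
        if d == 4:
            sc, n = shift_count, 1            # no division overflow assumed
        elif n and d == 8:
            sc -= 1                           # SC<-countdn SC
        elif n and d == 10:
            if sc != 0:
                d = 7                         # repeat the three-step loop
            else:
                wc = wait_count
                loop_exit_step = step
        elif n and d == 11:
            if wc == 0:
                return step, loop_exit_step   # sequence terminates here
            d, wc = 10, wc - 1                # one-statement waiting loop
        # Phase P(2): counter D is incremented.
        d = (d + 1) % 16


total, exit_step = division_timing()
print(total, (exit_step - 1) % 12 + 1)
```

Under these assumptions the loop exit falls on the fifth step of an execution cycle and the whole sequence occupies a whole number of 12-step cycles, in agreement with the description.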


4.4 A Parallel Adder with Group and Section Carries


A parallel adder has been described by terminal statements (4.3) and (4.4) and
shown in Figs. 4.4 and 4.5. Since the addition time of a parallel adder is closely related 
to the computing speed of an arithmetic unit, this section describes, as an illustration, 
another parallel adder which makes use of group carries and section carries for the 
purpose of achieving a shorter addition time. 

In the following discussion, the term level is employed to describe one
logical-and circuit or one logical-or circuit. For example, when a signal goes through ten
such logical circuits in cascade, the signal is said to go through ten levels. The more
levels a signal goes through, the longer time it takes for the signal to propagate 
through. Therefore, the number of levels a carry or a sum bit has to propagate 
through is used here as a measure of time. 


4.4.1 Single-bit Full Adder 


Boolean functions for a full adder have been described by (4.1) and (4.2). Instead 
of these two equations, a full adder can also be described in another way as shown 
below. 


Let     P(i)=ADSR(i)+ADAC(i)                                                 (4.14)
and     CX(i)=ADSR(i)⊙ADAC(i)                                                (4.15)
or      CX(i)=ADSR(i)'*ADAC(i)'+ADSR(i)*ADAC(i)                              (4.16)


P(i) becomes 1 when augend bit ADAC(i) or addend bit ADSR(i) is 1. Thus, P(i)
represents the condition that the ith full adder propagates the ith local carry if it
occurs; this is called carry propagate. Thus P(i)*C(i) represents the input carry C(i)
which has propagated through the ith stage. CX(i) becomes 1 when the augend and
the addend bits are both 0 or both 1. Then, P(i)*CX(i) represents a generated carry
because P(i)*CX(i) is 1 only when the augend and addend bits are both 1. Since the
output carry C(i-1) is either a generated carry or a propagated carry, we have


C(i-1)=P(i)*(CX(i)+C(i))


The full adder can now be described by the following two Boolean functions:


ADD(i)=CX(i)⊙C(i)                                                            (4.17)
C(i-1)=P(i)*(CX(i)+C(i))                                                     (4.18)


Note that the sum bit described by (4.17) is identical to that described by (4.5) when 
CX(i) in (4.17) is substituted by that in (4.15). 
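The two Boolean functions can be checked against ordinary binary addition with a short truth-table sketch in Python. The function name full_adder is hypothetical, and the sketch reads the operator in (4.15) and (4.17) as coincidence, which is what the statement that CX(i) is 1 when both bits are 0 or both are 1 implies.

```python
from itertools import product


def full_adder(a, b, c):
    """One stage: a = ADAC(i), b = ADSR(i), c = C(i); returns (ADD(i), C(i-1))."""
    p = a | b                   # carry propagate P(i)
    cx = 1 - (a ^ b)            # coincidence CX(i): 1 when a and b agree
    add = 1 - (cx ^ c)          # sum bit: coincidence of CX(i) and C(i)
    carry_out = p & (cx | c)    # C(i-1) = P(i)*(CX(i)+C(i))
    return add, carry_out


# Every input combination must agree with the arithmetic sum a + b + c.
for a, b, c in product((0, 1), repeat=3):
    add, carry_out = full_adder(a, b, c)
    assert 2 * carry_out + add == a + b + c
```

All eight input combinations pass, so the pair of functions behaves as a full adder.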


4.4.2 Organization of the Parallel Adder 


A parallel adder can be organized by using the single-bit full adder described by 
the above Boolean functions (4.17) and (4.18). A 37-bit parallel adder so organized 
can be concisely described by the following terminal statement. 


Terminal, P(Q,1-35)=ADSR(Q,1-35)+ADAC(Q,1-35),                               (4.19)
          CX(R,Q,1-35)=ADSR(R,Q,1-35)⊙ADAC(R,Q,1-35),                        (4.20)
          ADD(R,Q,1-35)=CX(R,Q,1-35)⊙C(R,Q,1-35),                            (4.21)
          C(R,Q,1-34)=P(Q,1-35)*(CX(Q,1-35)+C(Q,1-35)),                      (4.22)


The above carries described by (4.22) are to be referred to as local carries. A
logic diagram showing the local carries of the parallel adder can be constructed from
the above terminal statement; it is shown in Fig. 4.15, where there is a line along
which the carry propagates. This line is known as the line of carry propagation, and
the parallel adder is referred to as one with the gated carry. For a local carry to
propagate along the line, it must pass two levels per stage. Thus, local carries C(35),
C(34), C(33), ..., C(1), C(Q), and C(R) may occur after propagating through 0, 2,
4, ..., 68, 70, and 72 levels, respectively. There are ten local carries, C(35), C(31), ...,
C(3), and C(R), which will be of particular interest. These ten local carries and the
levels that they may propagate through are shown in the first and second columns of
Table 4.1. Similarly, it can be shown that the signals on the sum terminals ADD(35),
ADD(34), ADD(33), ..., ADD(2) and ADD(1) occur after propagating through
2, 4, 6, ..., 68, and 70 levels, respectively.


Fig. 4.15 The line of carry propagation of the parallel adder
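The gated-carry adder and its level count can be sketched in Python as follows. The sketch is illustrative only: ripple_add is a hypothetical name, bit 0 of each integer stands for stage 35 and bit 36 for stage R, and the level of each local carry is simply taken as two levels per stage passed.

```python
WIDTH = 37   # stages R, Q, 1-35; bit 0 of an integer is stage 35


def ripple_add(a, b, carry_in=0):
    """Add stage by stage; return (sum, level of the carry entering each stage)."""
    carry, total, levels = carry_in, 0, []
    for k in range(WIDTH):
        x, y = (a >> k) & 1, (b >> k) & 1
        levels.append(2 * k)                   # two levels for each stage passed
        cx = 1 - (x ^ y)                       # coincidence of the two input bits
        total |= (1 - (cx ^ carry)) << k       # sum bit of this stage
        carry = (x | y) & (cx | carry)         # local carry into the next stage
    return total, levels


total, levels = ripple_add(0b1011, 0b0110)
print(total, levels[36])
```

The last entry of levels is the level of C(R), which comes out as 72 and shows why the carry chain dominates the addition time.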

The carry propagation in the above parallel adder takes too much time to be
desirable, because the longer the propagation time, the longer the addition time.
Since multiplication is accomplished by repeated additions, division by repeated
subtractions, and subtraction by addition of complements, the reduction of carry
propagation time in turn reduces the time of addition, subtraction, multiplication,
and division. For this reason, the reduction of carry propagation time in a parallel
adder is of great importance to the speed of an arithmetic unit.

One approach to reduce the carry propagation time is to organize stages of full
adders into groups. Let every four stages of full adders be combined into a group.
Thus, the 35th, 34th, 33rd, and 32nd stages are combined into group 1; the 31st,
30th, 29th, and 28th stages into group 2; and so forth. For the 37 stages of full adders,
there are 10 groups with group 10 having but one stage. Furthermore, let every three
groups be combined into a section. Thus, groups 1, 2, and 3 are organized to form
section 1; groups 4, 5, and 6 to form section 2; and so forth. For the ten groups,
there are four sections with section 4 having but one stage of full adder.

Each group is associated with a separate logic network for generating input
group carries, IG(1), IG(2), ..., IG(10). The block diagram in Fig. 4.16 indicates
the logic networks for the ten groups in the four sections. These ten input group
carries replace, respectively, the ten local carries C(35), C(31), ..., C(3), and C(R)
for the purpose of reducing the carry propagation time. The ten input group carries
and the levels that they may propagate through after the ten local carries are replaced
by the input group carries are shown in the third and fourth columns of Table 4.1.
Notice that the levels that the local carries may have to propagate through are greatly
reduced. For example, the levels for carry C(R) are reduced from 72 to 21 levels.
Boolean equations for describing these input group carries will be derived later.
The terminal statement which defines these input group carries is presented below.


Fig. 4.16 Logic network for generating group carries of the nine
groups in the three sections


Terminal, IG(1)=C(35)                                                        (4.23)
          IG(2)=G(1)+IG(1)*PG(1)                                             (4.24)
          IG(3)=G(2)+IG(2)*PG(2)                                             (4.25)
          IG(4)=G(3)+IG(3)*PG(3)                                             (4.26)
          IG(5)=G(4)+IG(4)*PG(4)                                             (4.27)
          IG(6)=G(5)+IG(5)*PG(5)                                             (4.28)
          IG(7)=G(6)+IG(6)*PG(6)                                             (4.29)
          IG(8)=G(7)+IG(7)*PG(7)                                             (4.30)
          IG(9)=G(8)+IG(8)*PG(8)                                             (4.31)
          IG(10)=G(9)+IG(9)*PG(9)                                            (4.32)


where G(1), G(2),..., and G(9) are called group carries and PG(1), PG(2),..., 
PG(9) are called group propagates. All of these are generated by the logic networks 
in Fig. 4.16. 

Each section is also associated with a separate logic network for generating
input section carries, IS(1), ..., IS(4). The block diagram in Fig. 4.17 indicates the
logic networks for the three sections. In order to further reduce the carry propagation
time, these input section carries replace four input group carries IG(1), IG(4), IG(7),
and IG(10). The six input group carries, the four input section carries, and the levels
which they may propagate through are shown in the fifth and the sixth columns of
Table 4.1. As shown in that table, the number of the levels that these carries may
have to propagate through is further reduced. The maximum number of levels is
reduced from 21 in the fourth column to 13 in the sixth column. No carry has to
propagate through more than 13 levels, and, as shown in the seventh and eighth
columns of Table 4.1, no sum bit has to propagate through more than 15 levels.


Fig. 4.17 Logic network for generating section carries for the
three sections


TABLE 4.1 Input Group and Section Carries and Their Levels

                                        INPUT GROUP OR
LOCAL CARRY       INPUT GROUP CARRY     SECTION CARRY       SUM BIT
Name     Level    Name      Level       Name      Level     Name       Level
C(35)      0      IG(1)       0         IS(1)       0       ADD(35)      2
C(31)      8      IG(2)       5         IG(2)       5       ADD(31)      7
C(27)     16      IG(3)       7         IG(3)       7       ADD(27)      9
C(23)     24      IG(4)       9         IS(2)       7       ADD(23)      9
C(19)     32      IG(5)      11         IG(5)       9       ADD(19)     11
C(15)     40      IG(6)      13         IG(6)      11       ADD(15)     13
C(11)     48      IG(7)      15         IS(3)       9       ADD(11)     11
C(7)      56      IG(8)      17         IG(8)      11       ADD(7)      13
C(3)      64      IG(9)      19         IG(9)      13       ADD(3)      15
C(R)      72      IG(10)     21         IS(4)      11       ADD(R)      13


Boolean equations for describing these input section carries will be derived later.
The terminal statement which defines these input section carries is presented below.


Terminal, IS(1)=C(35)                                                        (4.33)
          IS(2)=S(1)+IS(1)*PS(1)                                             (4.34)
          IS(3)=S(2)+IS(2)*PS(2)                                             (4.35)
          IS(4)=S(3)+IS(3)*PS(3)                                             (4.36)


where S(1), S(2), and S(3) are called section carries, and PS(1), PS(2), and PS(3) are
called section propagates. All of these are generated by the logic networks in Fig. 4.17.


4.4.3 Terminal Statements for the Parallel Adder 


The parallel adder with both group and section carries is defined by the following 
terminal statements: 


1. Terminal statement for the full adders described by Boolean functions (4.19)-(4.22),
except local carries C(35), C(31), C(27), C(23), C(19), C(15), C(11), C(7), C(3), and
C(R), because they are replaced by input group carries IG(1), ..., and IG(10),
respectively.


2. Terminal statements for the group carries described by Boolean functions (4.45),
(4.47)-(4.54), and for the group propagates by functions (4.55) and (4.56).


3. Terminal statement for the input group carries described by Boolean functions (4.23)-
(4.32), except input group carries IG(1), IG(4), IG(7), and IG(10), which are replaced
by input section carries IS(1), ..., IS(4), respectively.


4. Terminal statement for the section carries described by Boolean functions (4.57)-
(4.59), for the section propagates described by functions (4.60)-(4.62), and for the
input section carries described by functions (4.33)-(4.36).


These terminal statements are now presented below. 


Comment, Inputs to the parallel adder                                        (4.37)
Terminal, ADAC(R,Q,1-35),
          ADSR(R,Q,1-35),
          C(35),
Comment, Sum terminals
Terminal, ADD(R)=CX(R)⊙IS(4),
          ADD(Q,1-2)=CX(Q,1-2)⊙C(Q,1-2),
          ADD(3)=CX(3)⊙IG(9),
          ADD(4-6)=CX(4-6)⊙C(4-6),
          ADD(7)=CX(7)⊙IG(8),
          ADD(8-10)=CX(8-10)⊙C(8-10),
          ADD(11)=CX(11)⊙IS(3),
          ADD(12-14)=CX(12-14)⊙C(12-14),
          ADD(15)=CX(15)⊙IG(6),
          ADD(16-18)=CX(16-18)⊙C(16-18),
          ADD(19)=CX(19)⊙IG(5),
          ADD(20-22)=CX(20-22)⊙C(20-22),
          ADD(23)=CX(23)⊙IS(2),
          ADD(24-26)=CX(24-26)⊙C(24-26),
          ADD(27)=CX(27)⊙IG(3),
          ADD(28-30)=CX(28-30)⊙C(28-30),
          ADD(31)=CX(31)⊙IG(2),
          ADD(32-34)=CX(32-34)⊙C(32-34),
          ADD(35)=CX(35)⊙IS(1),
Comment, Local carries and propagates
Terminal, C(Q,1-2)=P(1-3)*(CX(1-3)+C(1-3)),
          C(3)=IG(9),
          C(4-6)=P(5-7)*(CX(5-7)+C(5-7)),

          C(7)=IG(8),
          C(8-10)=P(9-11)*(CX(9-11)+C(9-11)),
          C(11)=IS(3),
          C(12-14)=P(13-15)*(CX(13-15)+C(13-15)),
          C(15)=IG(6),
          C(16-18)=P(17-19)*(CX(17-19)+C(17-19)),
          C(19)=IG(5),
          C(20-22)=P(21-23)*(CX(21-23)+C(21-23)),
          C(23)=IS(2),
          C(24-26)=P(25-27)*(CX(25-27)+C(25-27)),
          C(27)=IG(3),
          C(28-30)=P(29-31)*(CX(29-31)+C(29-31)),
          C(31)=IG(2),
          C(32-34)=P(33-35)*(CX(33-35)+C(33-35)),
          P(Q,1-35)=ADSR(Q,1-35)+ADAC(Q,1-35),
          CX(R,Q,1-35)=ADSR(R,Q,1-35)⊙ADAC(R,Q,1-35),
Comment, Input group carries and input section carries
Terminal, IS(1)=C(35),
          IG(2)=G(1)+IS(1)*PG(1),
          IG(3)=G(2)+IG(2)*PG(2),
          IS(2)=S(1)+IS(1)*PS(1),
          IG(5)=G(4)+IS(2)*PG(4),
          IG(6)=G(5)+IG(5)*PG(5),
          IS(3)=S(2)+IS(2)*PS(2),
          IG(8)=G(7)+IS(3)*PG(7),
          IG(9)=G(8)+IG(8)*PG(8),
          IS(4)=S(3)+IS(3)*PS(3),
Comment, Group propagates
Terminal, PG(1)=P(32)*P(33)*P(34)*P(35),
          PG(2)=P(28)*P(29)*P(30)*P(31),
          PG(3)=P(24)*P(25)*P(26)*P(27),
          PG(4)=P(20)*P(21)*P(22)*P(23),
          PG(5)=P(16)*P(17)*P(18)*P(19),
          PG(6)=P(12)*P(13)*P(14)*P(15),

          PG(7)=P(8)*P(9)*P(10)*P(11),
          PG(8)=P(4)*P(5)*P(6)*P(7),
          PG(9)=P(Q)*P(1)*P(2)*P(3),
Comment, Section propagates
Terminal, PS(1)=PG(1)*PG(2)*PG(3),
          PS(2)=PG(4)*PG(5)*PG(6),
          PS(3)=PG(7)*PG(8)*PG(9),
Comment, Section carries
Terminal, S(1)=G(3)+G(2)*PG(3)+G(1)*PG(2)*PG(3),
          S(2)=G(6)+G(5)*PG(6)+G(4)*PG(5)*PG(6),
          S(3)=G(9)+G(8)*PG(9)+G(7)*PG(8)*PG(9),
Comment, Group carries
Terminal, G(1)=P(32)*(CX(32)+P(33))
               *(CX(32)+CX(33)+P(34))
               *(CX(32)+CX(33)+CX(34)+P(35))
               *(CX(32)+CX(33)+CX(34)+CX(35)),
          G(2)=P(28)*(CX(28)+P(29))
               *(CX(28)+CX(29)+P(30))
               *(CX(28)+CX(29)+CX(30)+P(31))
               *(CX(28)+CX(29)+CX(30)+CX(31)),
          G(3)=P(24)*(CX(24)+P(25))
               *(CX(24)+CX(25)+P(26))
               *(CX(24)+CX(25)+CX(26)+P(27))
               *(CX(24)+CX(25)+CX(26)+CX(27)),
          G(4)=P(20)*(CX(20)+P(21))
               *(CX(20)+CX(21)+P(22))
               *(CX(20)+CX(21)+CX(22)+P(23))
               *(CX(20)+CX(21)+CX(22)+CX(23)),
          G(5)=P(16)*(CX(16)+P(17))
               *(CX(16)+CX(17)+P(18))
               *(CX(16)+CX(17)+CX(18)+P(19))
               *(CX(16)+CX(17)+CX(18)+CX(19)),
          G(6)=P(12)*(CX(12)+P(13))
               *(CX(12)+CX(13)+P(14))
               *(CX(12)+CX(13)+CX(14)+P(15))
               *(CX(12)+CX(13)+CX(14)+CX(15)),
          G(7)=P(8)*(CX(8)+P(9))
               *(CX(8)+CX(9)+P(10))
               *(CX(8)+CX(9)+CX(10)+P(11))
               *(CX(8)+CX(9)+CX(10)+CX(11)),
          G(8)=P(4)*(CX(4)+P(5))
               *(CX(4)+CX(5)+P(6))
               *(CX(4)+CX(5)+CX(6)+P(7))
               *(CX(4)+CX(5)+CX(6)+CX(7)),
          G(9)=P(Q)*(CX(Q)+P(1))
               *(CX(Q)+CX(1)+P(2))
               *(CX(Q)+CX(1)+CX(2)+P(3))
               *(CX(Q)+CX(1)+CX(2)+CX(3)),
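The organization defined by these terminal statements can be exercised with a Python sketch that builds the group and section signals and compares the result with ordinary integer addition. It is a sketch under stated assumptions: the names xnor, group_carry, and lookahead_add are hypothetical, bit 0 of each integer stands for stage 35 and bit 36 for stage R, and a carry generated in a group is required to propagate only through the higher groups of its section, that is, S(1)=G(3)+G(2)*PG(3)+G(1)*PG(2)*PG(3).

```python
GROUPS = 9            # four-stage groups 1-9; stage R alone forms group 10


def xnor(a, b):
    """Coincidence of two bits."""
    return 1 - (a ^ b)


def group_carry(p, cx):
    """Product-of-sums group carry; index 0 is the most significant stage."""
    g = p[0]
    for k in range(1, len(p)):
        g &= int(any(cx[:k])) | p[k]
    return g & int(any(cx))


def lookahead_add(a, b, c35=0):
    """Add two 37-bit integers; bit 0 of each integer is stage 35."""
    x = [(a >> k) & 1 for k in range(37)]
    y = [(b >> k) & 1 for k in range(37)]
    p = [u | v for u, v in zip(x, y)]
    cx = [xnor(u, v) for u, v in zip(x, y)]
    G, PG = {}, {}
    for g in range(1, GROUPS + 1):
        stages = range(4 * g - 1, 4 * g - 5, -1)     # most significant stage first
        G[g] = group_carry([p[k] for k in stages], [cx[k] for k in stages])
        PG[g] = int(all(p[k] for k in stages))
    carry_in = {1: c35}                              # IS(1) = C(35)
    for g in range(1, GROUPS + 1):
        if g % 3 == 0:                               # groups 3, 6, 9 close a section
            s = G[g] | (G[g - 1] & PG[g]) | (G[g - 2] & PG[g - 1] & PG[g])
            ps = PG[g - 2] & PG[g - 1] & PG[g]
            carry_in[g + 1] = s | (carry_in[g - 2] & ps)      # IS(2), IS(3), IS(4)
        else:
            carry_in[g + 1] = G[g] | (carry_in[g] & PG[g])    # input group carry
    total, c = 0, 0
    for k in range(37):
        if k % 4 == 0:
            c = carry_in[k // 4 + 1]                 # group or section carry replaces C
        total |= xnor(cx[k], c) << k                 # sum bit
        c = p[k] & (cx[k] | c)                       # local carry into the next stage
    return total


print(lookahead_add(0xF80, 0x080) == 0xF80 + 0x080)
```

The operands in the last line make group 2 generate a carry at its most significant stage while group 3 only propagates it, which exercises the middle term of the section carry.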


4.4.4 Group Carries 


Boolean functions for group carries G(i), group propagates PG(i), and input 
group carries IG(i) are now derived. The first group consists of the 35th, 34th, 33rd, 
and 32nd stages. We have from (4.22): 


C(34)=P(35)*(CX(35)+C(35)),                                                  (4.38)
C(33)=P(34)*(CX(34)+C(34)),                                                  (4.39)
C(32)=P(33)*(CX(33)+C(33)),                                                  (4.40)
C(31)=P(32)*(CX(32)+C(32)),                                                  (4.41)


A group carry is generated without the use of local carries within the group and
without the input group carry. Therefore, group carry G(1) can be obtained from the
Boolean equations (4.38)-(4.41) by eliminating the local carries C(32)-C(35) and by
setting input carry C(35) to zero. Thus, we have from (4.38):


C(34)=P(35)*CX(35)                                                           (4.42)

By substituting C(34) in (4.42) into (4.39), we have, after simplification,

C(33)=P(34)*(CX(34)+P(35))*(CX(34)+CX(35))                                   (4.43)

By substituting C(33) in (4.43) into (4.40), we have, after simplification,


C(32)=P(33)*(CX(33)+P(34))
      *(P(35)+CX(33)+CX(34))                                                 (4.44)
      *(CX(33)+CX(34)+CX(35))


By substituting C(32) in (4.44) into (4.41), the resulting carry C(31) becomes group
carry G(1) of the first group. Thus, we have

G(1)=P(32)*(CX(32)+P(33))
     *(CX(32)+CX(33)+P(34))
     *(CX(32)+CX(33)+CX(34)+P(35))                                           (4.45)
     *(CX(32)+CX(33)+CX(34)+CX(35))


The second group consists of the 31st, 30th, 29th, and 28th stages. From (4.22)
we have:

C(30)=P(31)*(CX(31)+C(31)),
C(29)=P(30)*(CX(30)+C(30)),                                                  (4.46)
C(28)=P(29)*(CX(29)+C(29)),
C(27)=P(28)*(CX(28)+C(28)),


The input carry C(31) in (4.46) is again ignored. After substitution and
simplification of these functions, the resulting carry C(27) becomes group carry G(2) of the
second group. Thus, we have:


G(2)=P(28)*(CX(28)+P(29))
     *(CX(28)+CX(29)+P(30))
     *(CX(28)+CX(29)+CX(30)+P(31))                                           (4.47)
     *(CX(28)+CX(29)+CX(30)+CX(31)),


Boolean functions for the other group carries can be similarly derived. They are,


G(3)=P(24)*(CX(24)+P(25))
     *(CX(24)+CX(25)+P(26))
     *(CX(24)+CX(25)+CX(26)+P(27))                                           (4.48)
     *(CX(24)+CX(25)+CX(26)+CX(27)),

G(4)=P(20)*(CX(20)+P(21))
     *(CX(20)+CX(21)+P(22))
     *(CX(20)+CX(21)+CX(22)+P(23))                                           (4.49)
     *(CX(20)+CX(21)+CX(22)+CX(23)),

G(5)=P(16)*(CX(16)+P(17))
     *(CX(16)+CX(17)+P(18))
     *(CX(16)+CX(17)+CX(18)+P(19))                                           (4.50)
     *(CX(16)+CX(17)+CX(18)+CX(19)),

G(6)=P(12)*(CX(12)+P(13))
     *(CX(12)+CX(13)+P(14))
     *(CX(12)+CX(13)+CX(14)+P(15))                                           (4.51)
     *(CX(12)+CX(13)+CX(14)+CX(15)),

G(7)=P(8)*(CX(8)+P(9))
     *(CX(8)+CX(9)+P(10))
     *(CX(8)+CX(9)+CX(10)+P(11))                                             (4.52)
     *(CX(8)+CX(9)+CX(10)+CX(11)),

G(8)=P(4)*(CX(4)+P(5))
     *(CX(4)+CX(5)+P(6))
     *(CX(4)+CX(5)+CX(6)+P(7))                                               (4.53)
     *(CX(4)+CX(5)+CX(6)+CX(7)),

G(9)=P(Q)*(CX(Q)+P(1))
     *(CX(Q)+CX(1)+P(2))
     *(CX(Q)+CX(1)+CX(2)+P(3))                                               (4.54)
     *(CX(Q)+CX(1)+CX(2)+CX(3)),


When group propagate PG(i) is 1, it represents the condition that the ith group
propagates the ith input group carry IG(i) when the latter occurs. Thus, PG(i)*IG(i)
represents input group carry IG(i) which has propagated through the ith group.
PG(1) is 1 if carry propagates P(35), P(34), P(33), and P(32) for the four stages of the
first group are all 1. Thus, we have:

PG(1)=P(32)*P(33)*P(34)*P(35)                                                (4.55)

Similarly, group propagates of the other groups are,


PG(2)=P(28)*P(29)*P(30)*P(31)
PG(3)=P(24)*P(25)*P(26)*P(27)
PG(4)=P(20)*P(21)*P(22)*P(23)
PG(5)=P(16)*P(17)*P(18)*P(19)
PG(6)=P(12)*P(13)*P(14)*P(15)                                                (4.56)
PG(7)=P(8)*P(9)*P(10)*P(11)
PG(8)=P(4)*P(5)*P(6)*P(7)
PG(9)=P(Q)*P(1)*P(2)*P(3)


Input group carries are generated by the group logic network as indicated by
Fig. 4.16. Input group carry for group 1, IG(1), is input carry C(35) for the 35th
stage of full adder. Thus, we have:

IG(1)=C(35)


Input group carry for group 2, IG(2), can be the carry generated within the first
group G(1) or the input group carry IG(1) propagated through group 1. Thus, we
have:


IG(2)=G(1)+IG(1)*PG(1)


Other input group carries are similarly derived and have been shown in (4.25)-(4.32).

The level for each of the group carries G(i) can be evaluated from (4.45) and
(4.47)-(4.54); it is 4. The level for each of the group propagates PG(i) can be
evaluated from (4.55) and (4.56); it is 2. Therefore, the levels for the input group
carries IG(1), IG(2), ..., IG(10) can be evaluated from (4.23)-(4.32); they are 0,
5, 7, ..., 21, respectively. These input group carries, together with the levels that they
may have propagated through, are shown in the third and fourth columns of Table
4.1.


4.4.5 Section Carries 


Boolean functions for section carries S(i), section propagates PS(i), and input
section carries IS(i) are now derived. The first section consists of groups 1, 2, and 3.
Section carry S(1) is generated within the section, with the input section carry
ignored. Thus, S(1) can be a group carry from the third group G(3), or a group carry
from the second group G(2) which has propagated through the third group, or a
group carry from the first group G(1) which has propagated through the second and
third groups. Thus, the Boolean function for the first section is:


S(1)=G(3)+G(2)*PG(3)+G(1)*PG(2)*PG(3)                                        (4.57)

Boolean functions for the other section carries can be similarly derived. They are:

S(2)=G(6)+G(5)*PG(6)+G(4)*PG(5)*PG(6)                                        (4.58)
and

S(3)=G(9)+G(8)*PG(9)+G(7)*PG(8)*PG(9)                                        (4.59)


When section propagate PS(i) is 1, it represents the condition that the ith section
propagates the ith input section carry IS(i) when the latter occurs. Thus, PS(i)*IS(i)
represents input section carry IS(i) which has propagated through the ith section.
PS(1) is 1 if group propagates PG(1), PG(2), and PG(3) of the first section are all 1.
Thus, we have:


PS(1)=PG(1)*PG(2)*PG(3)                                                      (4.60)


Similarly, section propagates of the other sections are,


PS(2)=PG(4)*PG(5)*PG(6)                                                      (4.61)
PS(3)=PG(7)*PG(8)*PG(9)                                                      (4.62)


Input section carries are those generated by the section logic network as
indicated in Fig. 4.17. Input section carry for section 1, IS(1), is input carry C(35) for the
thirty-fifth stage of full adder. Thus, we have:


IS(1)=C(35)


Input section carry for section 2, IS(2), consists of the carry generated within the first
section S(1) or the input section carry IS(1) propagated through section 1. Thus, we
have:


IS(2)=S(1)+IS(1)*PS(1)


Other input section carries similarly derived are shown in (4.35) and (4.36).

The level for each of the section carries S(i) can be evaluated from (4.57), (4.58),
and (4.59); it is 6. The level for each of the section propagates can be evaluated from
(4.60), (4.61), and (4.62); it is 3. The levels for the input section carries IS(1), ...,
IS(4) can be evaluated from (4.33)-(4.36); they are 0, 7, 9, and 11. These input section
carries which replace IG(1), IG(4), IG(7), and IG(10), together with the levels that
they may have propagated through, are shown in the fifth and the sixth columns of
Table 4.1. The levels for the sum bits can be evaluated from (4.21); they are shown
in the seventh and eighth columns of Table 4.1.
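The carry and sum levels of Table 4.1 can be reproduced with a short Python sketch. It rests on two assumptions that are not stated as rules in the text: the output of a logical-and or logical-or circuit is taken to be one level later than its latest input, and each sum bit is taken to be two levels later than the carry entering its stage. The names gate and carry_levels are hypothetical.

```python
G_LEVEL, PG_LEVEL = 4, 2      # levels quoted in the text for G(i) and PG(i)
S_LEVEL, PS_LEVEL = 6, 3      # levels quoted in the text for S(i) and PS(i)


def gate(*input_levels):
    """A gate output is one level later than its latest input (assumed rule)."""
    return max(input_levels) + 1


def carry_levels(use_sections):
    """Return the levels of the carries entering groups 1 to 10."""
    level = {1: 0}                                  # C(35) is available at level 0
    for g in range(1, 10):
        if use_sections and g % 3 == 0:             # IS = S + IS*PS
            level[g + 1] = gate(S_LEVEL, gate(level[g - 2], PS_LEVEL))
        else:                                       # IG = G + IG*PG
            level[g + 1] = gate(G_LEVEL, gate(level[g], PG_LEVEL))
    return [level[g] for g in range(1, 11)]


group_only = carry_levels(use_sections=False)
with_sections = carry_levels(use_sections=True)
sum_bits = [lv + 2 for lv in with_sections]         # assumed two levels per sum bit
print(group_only)
print(with_sections)
print(sum_bits)
```

The three printed lists correspond to the fourth, sixth, and eighth columns of Table 4.1.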


4.5 Design Considerations 


Two design considerations are discussed in this section: the use of the double- 
rank register, and the need for other arithmetic instructions to supplement the addi- 
tion, subtraction, multiplication, and division instructions. 


4.5.1 Double-rank Registers 


There are many register transfers in the sequences described by statements (4.11), 
(4.12), and (4.13). These transfers are between registers whose flipflops may have 
delays either at their inputs or at their outputs. In practice, these delays limit the 
speed of register transfer and, in turn, impose the minimum clock period. One way to 
improve the speed of register transfer is to use double-rank registers. 

A double-rank register consists of two identical registers functioning as one.
For example, let register AC be replaced by a double-rank register called registers
AC1 and AC2. The number is normally stored in register AC1, though transiently
stored in register AC2 first. Now, consider the following execution statement taken
from description (4.11):


/SI*ADS*K(7)*P(1)/   AC(R,Q,M)<-0-AC(Q,M) add2 0-0-SR(M)-0                   (4.63)


With a double-rank register, the sum is first stored in register AC2 and then
transferred to register AC1. To describe the transfers precisely, we have:


Register, AC1(S,R,Q,1-35), AC2(S,R,Q,1-35)                                   (4.64)
Clock, P(1-3)
/SI*ADS*K(7)*P(1)/   AC2(R,Q,M)<-0-AC1(Q,M) add2 0-0-SR(M)-0,
/SI*ADS*K(7)*P(2)/   AC1(R,Q,M)<-AC2(R,Q,M),
/P(3)/               D<-countup D


In the above description, a three-phase clock is employed. During the first phase,
the addition is performed. During the second phase, the word in register AC2 is
transferred to register AC1. These two transfers can be made as fast as the
components allow, without the need of delays at the inputs or outputs of the registers.
During the third phase, the timing register D is incremented.

Instead of using the three-phase clock, one may retain the use of a two-phase
clock in conjunction with delays. Let the delays be


Delay, DE                                                                    (4.65)


The amount of the delay can be equal to or less than the delay between clock phases
P(1) and P(2). With the delay element, description (4.64) can be revised as follows:


Register, AC1(S,R,Q,1-35), AC2(S,R,Q,1-35),                                  (4.66)
Clock, P(1-2)
/SI*ADS*K(7)*P(1)/       AC2(R,Q,M)<-0-AC1(Q,M) add2 0-0-SR(M)-0,
/DE(SI*ADS*K(7)*P(1))/   AC1(R,Q,M)<-AC2(R,Q,M),
/P(2)/                   D<-countup D,


The above control signal for the add2 micro-operation is delayed and then used as
the control signal for the transfer from register AC2 to register AC1. Therefore, the
transfer micro-operation does not interfere with the incrementing-counter
micro-operation at clock phase P(2).
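The two-step transfer through a double-rank register can be illustrated with a very small Python sketch. The class and method names are hypothetical, plain integers stand in for the register contents, and no timing or delay is modeled; the point is only that AC1 stays stable while the sum is formed in AC2.

```python
class DoubleRankAccumulator:
    """Two ranks acting as one register: AC2 catches the sum, AC1 holds it."""

    def __init__(self, value=0):
        self.ac1 = value
        self.ac2 = 0

    def phase1_add(self, sr):
        self.ac2 = self.ac1 + sr      # first transfer: sum of AC1 and SR into AC2

    def phase2_transfer(self):
        self.ac1 = self.ac2           # second transfer: AC2 back into AC1


acc = DoubleRankAccumulator(5)
acc.phase1_add(7)
assert acc.ac1 == 5                   # AC1 is unchanged while AC2 receives the sum
acc.phase2_transfer()
assert acc.ac1 == 12
```

Because the adder reads AC1 and writes AC2, the register being read is never the one being written during the same transfer.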


4.5.2 Fixed-point Arithmetic Instructions 


An instruction with an op-code and an operand address is normally required
to order the arithmetic unit to carry out the addition, subtraction, multiplication,
or division sequence. Assume that the instruction consists of only an op-code and an
operand address. Then the instructions are ADD y, SUB y, MPY y, and DIV y,
where y specifies the address of the memory at which the addend, subtrahend,
multiplicand, or divisor is stored, respectively. However, these instructions alone are
not sufficient, and additional instructions are required.

To add two numbers in the memory requires two instructions, CLA (clear and
add) and ADD. The CLA instruction transfers the augend in the memory to the
accumulator, and the ADD instruction adds the addend in the memory to the
accumulator. If the result is to be stored into the memory, the STO (store) instruction is also
required. To subtract one number in the memory from another and to store the
result into the memory requires three instructions, namely, CLA y, SUB
y+1, STO y+2.

Three additional instructions which can be conveniently implemented and may 
sometimes be useful are: CLS (clear and subtract), ADM (add magnitude), and SBM 
(subtract magnitude). The CLS instruction is identical to the CLA instruction except 
that the sign of the augend is first changed. The ADM and SBM instructions are 
identical to the ADD and SUB instructions, respectively, except that the sign of the 
number in the memory is first set as positive. 
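
The effect of these single-address instructions can be illustrated with a small sketch. It is a toy model only: numbers are ordinary Python integers, so the 36-bit word length, the separate sign bit, and overflow are ignored, and the class and method names are illustrative rather than part of the design described here.

```python
class Accumulator:
    """Toy single-address machine: one accumulator and a word-addressed memory."""

    def __init__(self, memory):
        self.memory = dict(memory)  # address -> signed integer
        self.ac = 0

    def cla(self, y):   # clear and add: AC <- M(y)
        self.ac = self.memory[y]

    def cls(self, y):   # clear and subtract: AC <- -M(y)
        self.ac = -self.memory[y]

    def add(self, y):   # AC <- AC + M(y)
        self.ac += self.memory[y]

    def sub(self, y):   # AC <- AC - M(y)
        self.ac -= self.memory[y]

    def adm(self, y):   # add magnitude: operand sign taken as positive
        self.ac += abs(self.memory[y])

    def sbm(self, y):   # subtract magnitude: operand sign taken as positive
        self.ac -= abs(self.memory[y])

    def sto(self, y):   # store: M(y) <- AC
        self.memory[y] = self.ac


# CLA y, SUB y+1, STO y+2 with y = 100
machine = Accumulator({100: 7, 101: 12, 102: 0})
machine.cla(100)
machine.sub(101)
machine.sto(102)
```

After the three instructions, location 102 holds the difference of the words at locations 100 and 101.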

To multiply one number by another, both of which are in the memory, requires
two instructions, LDQ (load MQ) and MPY. The LDQ instruction loads the multiplier
into the MQ register. The MPY instruction clears the accumulator, transfers the
multiplicand to the SR register, and initiates the multiplication sequence. The product is
in the AC and MQ registers with the most significant part in the accumulator and the
least significant part in the MQ register. If both parts are to be stored into the memory,
two instructions are required, STO y and STQ y+1 (store MQ). The STQ instruction
stores the contents of the MQ register into the memory location addressed by y+1.

Shift instructions are useful in scaling by moving the binary point of the number 
in register AC or in casregister AC(M)-MQ(M). Instructions ALS (accumulator left 
shift) and ARS (accumulator right shift) shift the contents of subregister AC(M) to 
the left or right, respectively, the number of bit positions specified by the address. 
For these instructions, the address represents a shift count. Instructions LLS (long 
left shift) and LRS (long right shift) shift the contents of casregister AC(M)-MQ(M) 
to the left or right, respectively, the number of bit positions specified by the address. 
If the double-length product is not needed, instruction RND (round) can be provided
to round off the most significant part of the product. The roundoff may use the
simple rule that the AC(M) is incremented by one if bit MQ(1) contains a 1. Instead
of instruction RND, instruction MPR (multiply and round) may be provided. This
instruction is identical to the MPY instruction except that the most significant part
in the AC(M) is rounded after the multiplication.
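
The long shifts and the rounding rule can be sketched on the two 35-bit magnitudes as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, bits shifted out of the 70-bit casregister are simply dropped, and a carry out of AC(M) during rounding is only reported, since the text does not say how such cases are handled.

```python
MAG_BITS = 35
MAG_MASK = (1 << MAG_BITS) - 1


def lls(ac_mag, mq_mag, count):
    """Long left shift of casregister AC(M)-MQ(M) by `count` bit positions."""
    combined = ((ac_mag << MAG_BITS) | mq_mag) << count
    combined &= (1 << (2 * MAG_BITS)) - 1   # assumption: overflowing bits are lost
    return combined >> MAG_BITS, combined & MAG_MASK


def lrs(ac_mag, mq_mag, count):
    """Long right shift of casregister AC(M)-MQ(M) by `count` bit positions."""
    combined = ((ac_mag << MAG_BITS) | mq_mag) >> count
    return combined >> MAG_BITS, combined & MAG_MASK


def rnd(ac_mag, mq_mag):
    """Round: increment AC(M) by one if MQ(1), the leading magnitude bit, is 1."""
    mq1 = (mq_mag >> (MAG_BITS - 1)) & 1
    total = ac_mag + mq1
    return total & MAG_MASK, total > MAG_MASK   # (rounded AC(M), carry-out flag)
```

For example, a product whose low half has MQ(1) = 1 is rounded up by one unit in the last place of AC(M).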

The DIV instruction divides the dividend whose magnitude is the 70 bits in
casregister AC(M)-MQ(M). Therefore, before the division, registers AC and MQ
should be loaded by CLA and LDQ instructions and, if necessary, scaled by the
shift instructions. After the division, the quotient in the MQ register can be stored
by a STQ instruction and the remainder in the AC register by a STO instruction. If
division overflow occurs, indicator DVOV is set to 1 and the computer operation is
stopped. However, it is wasteful to stop computer operation. Therefore, the division
can be directed by two instructions, DVH (divide or halt) and DVP (divide or proceed).
The first one stops the computer operation when division overflow occurs,
while the second one does not.


4.6 Microprogramming the Arithmetic Unit 


The parallel, binary, fixed-point arithmetic unit has been described by the con- 
figuration in statements (4.1)-(4.4), and by the sequence charts in Figures 4.6-4.9, 
4.11, and 4.12. The arithmetic unit thus described is implemented by the sequential 
logic control. The configuration for sequential logic control was described in state- 
ment (4.10), and the arithmetic sequences were described in statements (4.11)-(4.13). 

In this section, the use of microprogram control for the arithmetic unit is to be 
described. The control configuration, the control signals, the control word format, 
the sequence descriptions, and the microprogram will be presented. 


4.6.1 Microprogram Control Configuration 


The configuration for microprogram control is shown in the block diagram in 
Fig. 4.18. Control memory CM has a capacity of 256 36-bit words with address 
register H and buffer register F. The two bits of register BR perform subsequencing 
control. The four-phase clock P(0-3), the single-bit register D, and the 6-bit register
MC generate the control signals. Register OPCODE stores the op-code of the instruc- 
tion. Main memory M has a capacity of 32,768 36-bit words with address register 
AD and buffer register SR. Register K stores an operand address. This configuration 
is now described by the following statements: 


Comment, microprogram control configuration                          (4.67)
Register,  H(0-7),        $control memory address register
           F(0-35),       $control word register
           MC(0-5),       $main-memory-cycle sequencing register
           D,             $main-memory-cycle wait register
           OPCODE(0-3),   $op-code register
           K(0-14),       $operand address register
           AD(0-14),      $main memory address register
           SR(0-35),      $main memory buffer register


Fig. 4.18 Microprogram control configuration for the parallel,
binary, fixed-point arithmetic unit


Subregister, F(ADS)=F(0-7),             $address field of register F

Memory,    CM(H)=CM(0-255,0-35),        $control memory
           M(AD)=M(0-32767,0-35),       $main memory

Clock,     P(0-3),                      $four-phase clock

Block,     DSET(IF (MC(0)+MC(1)+MC(2)+MC(3)+MC(4)=1) THEN (D←1),
                IF (MC(5)=1) THEN (D←0)),
           MQSET(IF (MQ(34,35)=00) THEN (MQ(34,35)←01, C←0),
                 IF (MQ(34,35)=01) THEN (MQ(34,35)←10, C←0),
                 IF (MQ(34,35)=10) THEN (MQ(34,35)←11),
                 IF (MQ(34,35)=11) THEN (MQ(34,35)←00)),
           MQBRANCH(IF (MQ(34,35)=01) THEN (AC(R,Q,M)←AC(R,Q,M)
                        add2 0-0-SR(M)-0),
                    IF (MQ(34,35)=10) THEN (AC(R,Q,M)←AC(R,Q,M)
                        add2 0-SR(M)-0-0),
                    IF (MQ(34,35)=11) THEN (AC(R,Q,M)←0-0-AC(M)
                        add2 1-1-SR(M)'-1, C←1)),


In the above configuration, the block statement declares the names of three 
groups of micro-statements: DSET, MQSET, and MQBRANCH. Block DSET 
represents a conditional micro-operation for setting or resetting register D. Block 
MQSET represents the group of micro-operations which make the correction required
in the multiplication sequence. Block MQBRANCH represents the group of
micro-operations which test bits MQ(34,35) and then perform no addition, an addition,
or a subtraction of the multiplicand, as described previously in the multiplication
sequence. The micro-operations represented by blocks MQSET and MQBRANCH
have been shown in the sequence chart in Fig. 4.9.
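
The arithmetic behind blocks MQSET and MQBRANCH can be checked with a short sketch. It is an illustration on plain integers, not a register-level model: the multiplier is scanned two bits at a time, a pair of 01 adds the multiplicand, 10 adds twice the multiplicand, and 11 subtracts it and sets a carry C that corrects the next pair. The function name and the use of explicit weights in place of the right shifts of the hardware are assumptions of this sketch.

```python
def multiply_by_bit_pairs(multiplicand, multiplier, pairs=18):
    """Multiply two non-negative magnitudes by scanning multiplier bit pairs."""
    assert multiplicand >= 0
    assert 0 <= multiplier < (1 << (2 * pairs - 1))   # 35-bit multiplier for 18 pairs
    product = 0
    carry = 0                                         # plays the role of bit C
    for i in range(pairs):
        pair = (multiplier >> (2 * i)) & 0b11         # next two multiplier bits
        if carry:                                     # MQSET: add the pending carry
            if pair == 0b00:
                pair, carry = 0b01, 0
            elif pair == 0b01:
                pair, carry = 0b10, 0
            elif pair == 0b10:
                pair = 0b11                           # carry is regenerated below
            else:
                pair = 0b00                           # 11 + 1 = 100: carry stays set
        weight = 1 << (2 * i)
        if pair == 0b01:                              # MQBRANCH: add multiplicand
            product += multiplicand * weight
        elif pair == 0b10:                            # add twice the multiplicand
            product += 2 * multiplicand * weight
        elif pair == 0b11:                            # subtract and carry: 3 = 4 - 1
            product -= multiplicand * weight
            carry = 1
    assert carry == 0   # the top pair of a 35-bit multiplier absorbs any carry
    return product
```

Because each iteration consumes two multiplier bits, 18 iterations suffice for a 35-bit magnitude.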


4.6.2 Timing and Control Signals 


Each main memory cycle is chosen to coincide with six control memory cycles, 
and each control memory cycle to coincide with one clock cycle. Therefore, there 
are four steps in each control memory cycle and 24 steps in each main memory 
cycle. The control signals for these 24 steps are described by the following sequence 
of labels (in addition to the labels for starting): 


Comment, control signals during one main memory cycle                (4.68)
/START(ON)*P(3)/  MC←40₈,
/MC(0)*P(0)/                      $beginning of both memory cycles
/MC(0)*P(1)/
/MC(0)*P(2)/
/P(3)/            MC←cir MC,      $end of a control memory cycle
/MC(1)*P(0)/                      $beginning of a control memory cycle
/MC(1)*P(1)/
/MC(1)*P(2)/
/P(3)/            MC←cir MC,      $end of a control memory cycle
      .
      .
      .
/MC(5)*P(0)/                      $beginning of a control memory cycle
/MC(5)*P(1)/
/MC(5)*P(2)/
/P(3)/            MC←cir MC,      $end of both memory cycles


In the above labels, the four steps in each control memory cycle are sequenced by
the four phases of clock P(0-3), and the six control memory cycles in each main
memory cycle are sequenced by the six states of ring counter MC(0-5), which is
circularly right-shifted during the last clock phase P(3) of each control memory cycle.
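
The 24 steps of one main memory cycle can be enumerated with a small sketch of the ring counter and the four-phase clock. The function name is illustrative, and the counter is modeled as a 6-bit integer with MC(0) as the leftmost bit, which is an assumption about bit ordering consistent with the initial value of 40 (octal).

```python
def main_memory_cycle_steps():
    """Return the 24 (MC index, clock phase) pairs of one main memory cycle."""
    mc = 0o40                                  # 100000 binary: MC(0) = 1
    steps = []
    for _ in range(6):                         # six control memory cycles
        index = 6 - mc.bit_length()            # position of the single 1 in MC(0-5)
        for phase in range(4):                 # clock phases P(0) to P(3)
            steps.append((index, phase))
        mc = (mc >> 1) | ((mc & 1) << 5)       # MC <- cir MC at phase P(3)
    assert mc == 0o40                          # the ring counter has come full circle
    return steps


for number, (index, phase) in enumerate(main_memory_cycle_steps(), start=1):
    print(f"step {number:2d}: /MC({index})*P({phase})/")
```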

During each main memory cycle, a word is either read out of or written into
main memory M. Assume that the transfer of the operand address in register K to
the main memory address register AD and the initiation of the main memory read
or write operation occur during the second step. For a read operation, the word is
available at the storage register SR of the main memory during the 10th step. Thus,
we have:


/MC(0)*P(1)/  AD←K,                                                  (4.69)
/MC(2)*P(1)/  SR←M(AD),


The control signals for the four steps in every control memory cycle are described 
by the following labels: 

Comment, control signals during one control memory cycle             (4.70)
/P(0)*D'/         $beginning of a control memory cycle
/P(1)*D'/
/P(2)*D'/
/P(3)*D'/         $end of a control memory cycle

When register D contains a 0, the above labels appear; otherwise, they disappear.
Thus, register D controls the advance or halt of the steps in a control memory cycle.

During each control memory cycle, a control word is read out of the control
memory. Assume that the transfer of the control memory address to register H and
the initiation of the control memory read both occur during clock phase P(3) of the
preceding control memory cycle, and the control word becomes available at buffer
register F during clock phase P(0) of the current control memory cycle, or


/P(3)*D'/  H←countup H,   $end of the preceding control memory cycle  (4.71)
/P(0)*D'/  F←CM(H),       $beginning of the current control memory cycle
/P(1)*D'/
/P(2)*D'/


Micro-operations activated by the control bits in register F are executed during
clock phases P(1-3) of the current control memory cycle.


4.6.3 Control Word Format 


Table 4.2 shows the format of the control word. The 36 bits of the control word
in register F are divided into three groups: field F(0-7) which contains a control
memory address, field F(8-27) which contains the control bits, and field F(28-35)
which is not used. Field F(0-7) provides a branching address for each control word.
Each bit in field F(8-27) is assigned to control one or more micro-operations that
can happen as one step or as one sequence. These are the micro-operations that are
in the charts for the arithmetic sequences. The clock phase assigned to each control
bit is also shown in Table 4.2. This assignment is to be shown in the next section.
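
The three fields of the control word can be separated as in the following sketch. It assumes that the 36-bit word is held as an integer with F(0) as the most significant bit; the function name is illustrative.

```python
def split_control_word(word):
    """Split a 36-bit control word into address, active control bits, unused field."""
    address = (word >> 28) & 0xFF          # F(0-7): control memory address
    control = (word >> 8) & 0xFFFFF        # F(8-27): twenty control bits
    unused = word & 0xFF                   # F(28-35): not used
    # F(8) is the leftmost of the twenty control bits.
    active = [8 + i for i in range(20) if (control >> (19 - i)) & 1]
    return address, active, unused


# A word with branch address 3 and control bits F(8), F(9), F(10), F(11) set.
word = (3 << 28) | (0b1111 << 24)
print(split_control_word(word))
```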


4.6.4 Microprogramming the Arithmetic Sequences 


With the control word format in Table 4.2, the previously described control 
signals (4.68) and (4.70) are now assigned to the micro-operations in the sequence 
charts for the arithmetic sequences. This assignment consists of three parts. The 
first part is the assignment of the control signals for initialization and for generating 
the timing and control signals; this part does not contribute to the microprogram. 
The second part assigns the micro-operations in the sequence charts into one or 
more micro-instructions (or control words), and the third part assigns the control 
signal to each of the micro-operations in each micro-instruction. It is the result of
the second and third parts that yields the microprogram.

For the description of the arithmetic unit here, the first part assigns the control 
signals to three groups of micro-operations. The first group generates timing signals 
for the six control memory cycles in each main memory cycle. The second group 
initializes the fetch of the desired micro-instruction from the control memory. The
third group places the desired address into the address register of the control memory.
It is assumed that the first control memory addresses for the add-subtract sequence,
the multiplication sequence, and the division sequence are 0, 6, and 12, respectively.


TABLE 4.2  Control Word Format

CONTROL BIT  CLOCK PHASE   MICRO-OPERATIONS

F(0-7)                     control memory address
F(8)         P(0)          F←CM(H),
F(9)         P(1)          IF (OPCODE=1) THEN (SR(S)←SR(S)'), AC(R,Q)←0,
F(9)         P(2)          IF (SR(S)≠AC(S)) THEN (AC(M)←AC(M)', BR(1)←1)
                           ELSE (BR(1)←0),
F(9)         P(3)          AC(R,Q,M)←0-AC(Q,M) add2 0-0-SR(M)-0,
F(10)        P(3)          IF (BR(1)=1) THEN (H←F(ADS)) ELSE (H←countup H),
F(11)        P(3)          BR←0,
F(12)        P(1)          IF (AC(Q)=1) THEN (ADOV←1),
                           IF (AC(Q,M)=0) THEN (AC(S)←0),
F(13)        P(3)          H←F(ADS),
F(14)        P(3)          DO DSET,
F(15)        P(1)          IF (AC(Q)=1) THEN (AC(R,Q,M)←1 add2 0-AC(Q,M)-0, BR(1)←1)
                           ELSE (BR(1)←0),
F(15)        P(2)          IF (BR(1)=1) THEN (AC(Q)←0, AC(S)←AC(S)')
                           ELSE (IF (AC(M)=37...7₈) THEN (AC←0)
                           ELSE (AC(M)←AC(M)')),
F(16)        P(1)          IF (SR(S)=MQ(S)) THEN (AC(S)←0, MQ(S)←0)
                           ELSE (AC(S)←1, MQ(S)←1), AC(R,Q,M)←0,
F(17)        P(2)          IF ((SR(M)≠0)*(MQ(M)≠0)) THEN (SC←18, C←0, MQ(S)←0, BR(1)←0)
                           ELSE (MQ←0, AC(S)←0, BR(1)←1),
F(18)        P(3)          IF (BR(1)=1) THEN (DO DSET),
F(19)        P(1)          IF (C=1) THEN (DO MQSET),
F(19)        P(2)          DO MQBRANCH,
F(19)        P(3)          AC(2-35)-MQ←AC(Q,M)-MQ(S,1-33), AC(Q,1)←AC(R)-AC(R),
F(20)        P(1)          SC←countdn SC,
F(21)        P(3)          IF (SC=0) THEN (H←countup H),
F(22)        P(1)          AC(M)←AC(2-35)-MQ(S),
F(22)        P(2)          MQ(S)←AC(S),
F(23)        P(1)          AC(R,Q,M)←0-0-AC(M)',
                           IF (Z(0)=1) THEN (DVOV←1, BR(1)←1) ELSE (BR(2)←1),
F(23)        P(2)          IF (BR(2)=1) THEN (MQ(S)←SR(S)⊕AC(S), SC←35),
F(24)        P(1)          AC(M)-MQ(M)←AC(2-35)-MQ(1)-MQ(2-35)-0,
F(24)        P(2)          IF (Z(Q)=1) THEN (MQ(35)←1,
                           AC(R,Q,M)←0-0-AC(M) add2 0-0-SR(M)-0),
F(25)        P(1)          AC(M)←AC(M)',
F(26)        MC(0)*P(1)    AD←K,
F(26)        MC(2)*P(1)    SR←M(AD),
F(27)        MC(2)*P(3)    H←countup H,
F(28-35)                   not used


These micro-operations and the assigned control signals are shown below: 


Comment, microprogramming the parallel, binary, arithmetic unit      (4.72)
/START(ON)*P(2)/  MC←40₈, F←0,
/P(3)/            MC←cir MC,
/P(3)/            D←0, F(8)←1,
/P(3)/            IF (OPCODE=0+1) THEN (H←0),
                  IF (OPCODE=2) THEN (H←6),
                  IF (OPCODE=3) THEN (H←12),


In the statements above, the op-codes for addition, subtraction, multiplication, and 
division are assumed to be 0, 1, 2, and 3, respectively. The assignments of the control 
signals in the second and third parts are now shown for each of the three sequences. 

For the add-subtract sequence, there are four micro-instructions: AS1, AS2, AS3,
and AS4. Micro-instruction AS1 transfers the operand address in register K to address
register AD and reads the operand out of the main memory M. The micro-operations
in the sequence charts of Figs. 4.6 and 4.7 are assigned to three micro-instructions:
the initialization part for micro-instruction AS2, the add part for micro-instruction
AS3, and the subtract part for micro-instruction AS4. The assignment of the control
signals for the micro-operations in these sequence charts is shown below:


Comment, microprogramming the add-subtract sequence
Comment, micro-instruction AS1                                       (4.73)
/D'*P(0)*F(8)/       F←CM(H),
/MC(0)*P(1)*F(26)/   AD←K,
/MC(2)*P(1)*F(26)/   SR←M(AD),
/MC(2)*P(3)*F(27)/   H←countup H,
Comment, micro-instruction AS2                                       (4.74)
/D'*P(0)*F(8)/       F←CM(H),
/D'*P(1)*F(9)/       IF (OPCODE=1) THEN (SR(S)←SR(S)'), AC(R,Q)←0,
/D'*P(2)*F(9)/       IF (SR(S)≠AC(S)) THEN (AC(M)←AC(M)', BR(1)←1)
                     ELSE (BR(1)←0),
/D'*P(3)*F(9)/       AC(R,Q,M)←0-AC(Q,M) add2 0-0-SR(M)-0,
/D'*P(3)*F(10)/      IF (BR(1)=1) THEN (H←F(ADS))
                     ELSE (H←countup H),              $F(ADS)=AS4
/D'*P(3)*F(11)/      BR←0,
Comment, micro-instruction AS3                                       (4.75)
/D'*P(0)*F(8)/       F←CM(H),
/D'*P(1)*F(12)/      IF (AC(Q)=1) THEN (ADOV←1),
/D'*P(1)*F(12)/      IF (AC(Q,M)=0) THEN (AC(S)←0),
/D'*P(3)*F(13)/      H←F(ADS),                        $F(ADS)=exit address
/D'*P(3)*F(14)/      DO DSET,
Comment, micro-instruction AS4                                       (4.76)
/D'*P(0)*F(8)/       F←CM(H),
/D'*P(1)*F(15)/      IF (AC(Q)=1) THEN (AC(R,Q,M)←1 add2 0-AC(Q,M)-0, BR(1)←1)
                     ELSE (BR(1)←0),
/D'*P(2)*F(15)/      IF (BR(1)=1) THEN (AC(Q)←0, AC(S)←AC(S)')
                     ELSE (IF (AC(M)=37...7₈) THEN (AC←0)
                           ELSE (AC(M)←AC(M)')),
/D'*P(3)*F(13)/      H←F(ADS),                        $F(ADS)=exit address
/D'*P(3)*F(11)/      BR←0,
/D'*P(3)*F(14)/      DO DSET,


In the description above, the first micro-operation in micro-instructions AS2,
AS3, and AS4 reads the micro-instruction out of the control memory. Bit BR(1)
is used in micro-instruction AS2 in order to branch to the add part or the subtract
part of the add-subtract sequence. In micro-instruction AS4, bit BR(1) is used to
branch, depending on whether the difference is negative or positive. In
micro-instructions AS3 and AS4, register D is set to 1 if the current control memory cycle is not
control memory cycle MC(5) and reset to 0 if it is. The micro-operations for
conditionally setting and resetting register D are described by micro-statement DO DSET.
In this way, the add-subtract sequence is terminated at the end of a main memory
cycle.

The execution of the add-subtract sequence takes one main memory cycle (i.e.,
six control memory cycles MC(0)-MC(5)). The micro-operations in micro-instruction
AS1 are executed during control memory cycles MC(0) and MC(2); no micro-operation
occurs during control memory cycle MC(1). Micro-instructions AS2, AS3, and
AS4 are executed during control memory cycles MC(3), MC(4), and MC(5),
respectively.

For the multiplication sequence, there are four micro-instructions: M1, M2, M3,
and M4. Micro-instruction M1 is identical to that described by statements (4.73).
The micro-operations in the sequence charts of Figs. 4.8 and 4.9 are assigned to three
micro-instructions: the initialization part for micro-instruction M2, the multiplication
part for micro-instruction M3, and the finalization part for micro-instruction
M4. The assignments of the control signals for the micro-operations in these sequence
charts are shown below.




Comment, microprogramming the multiplication sequence                (4.77)
Comment, micro-instruction M1
/D'*P(0)*F(8)/       F←CM(H),
/MC(0)*P(1)*F(26)/   AD←K,
/MC(2)*P(1)*F(26)/   SR←M(AD),
/MC(2)*P(3)*F(27)/   H←countup H,
Comment, micro-instruction M2                                        (4.78)
/D'*P(0)*F(8)/       F←CM(H),
/D'*P(1)*F(16)/      IF (SR(S)=MQ(S)) THEN (AC(S)←0, MQ(S)←0)
                     ELSE (AC(S)←1, MQ(S)←1),
                     AC(R,Q,M)←0,
/D'*P(2)*F(17)/      IF ((SR(M)≠0)*(MQ(M)≠0)) THEN (SC←18, C←0, MQ(S)←0, BR(1)←0)
                     ELSE (MQ←0, AC(S)←0, BR(1)←1),
/D'*P(3)*F(10)/      IF (BR(1)=1) THEN (H←F(ADS))
                     ELSE (H←countup H),              $F(ADS)=exit address
/D'*P(3)*F(18)/      IF (BR(1)=1) THEN (DO DSET),
/D'*P(3)*F(11)/      BR←0,
Comment, micro-instruction M3                                        (4.79)
/D'*P(0)*F(8)/       F←CM(H),
/D'*P(1)*F(19)/      IF (C=1) THEN (DO MQSET),
/D'*P(1)*F(20)/      SC←countdn SC,
/D'*P(2)*F(19)/      DO MQBRANCH,
/D'*P(3)*F(19)/      AC(2-35)-MQ←AC(Q,M)-MQ(S,1-33), AC(Q,1)←AC(R)-AC(R),
/D'*P(3)*F(21)/      IF (SC=0) THEN (H←countup H),
Comment, micro-instruction M4                                        (4.80)
/D'*P(0)*F(8)/       F←CM(H),
/D'*P(1)*F(22)/      AC(M)←AC(2-35)-MQ(S),
/D'*P(2)*F(22)/      MQ(S)←AC(S),
/D'*P(3)*F(13)/      H←F(ADS),                        $F(ADS)=exit address
/D'*P(3)*F(14)/      DO DSET,




In the description above, bit BR(1) is used for branching in micro-instruction
M2. The multiplication loop is iterated 18 times by repeatedly executing
micro-instruction M3 until counter SC reaches 0. At that time, micro-instruction M4 is
read out of the control memory and executed. The multiplication can be terminated
during the execution of micro-instruction M2 or M4. The termination is again
accomplished by executing micro-statement DO DSET.

The multiplication sequence takes four main memory cycles. During the first 
cycle, micro-instructions M1 and M2 are executed and micro-instruction M3 is 
executed twice. M3 is executed six times during the second cycle and six times during 
the third cycle. During the fourth cycle, it is executed four times to complete the 
required 18 iterations of the multiplication loop. Then micro-instruction M4 is
executed. 
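
The cycle counts quoted for the sequences can be reproduced with a short calculation. The sketch below assumes, following the description of AS1, that the first micro-instruction of a sequence occupies control memory cycles MC(0) through MC(2), that every other micro-instruction takes one control memory cycle, and that DO DSET holds the unit until the end of the current main memory cycle. The function and parameter names are illustrative.

```python
import math

CONTROL_CYCLES_PER_MAIN_CYCLE = 6


def main_memory_cycles(loop_iterations, fetch_cycles=3, init_cycles=1, final_cycles=1):
    """Main memory cycles used by a sequence, rounded up to a whole cycle."""
    total = fetch_cycles + init_cycles + loop_iterations + final_cycles
    return math.ceil(total / CONTROL_CYCLES_PER_MAIN_CYCLE)


print(main_memory_cycles(0))    # add-subtract: AS1, AS2, then AS3 or AS4
print(main_memory_cycles(18))   # multiplication: M1, M2, 18 passes of M3, M4
print(main_memory_cycles(35))   # division: 35 passes through the divide loop
```

For the multiplication, 3 + 1 + 18 + 1 = 23 control memory cycles round up to four main memory cycles; the same count with a 35-iteration loop gives seven.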

For the division sequence, there are also four micro-instructions: D1, D2, D3, 
and D4. Micro-instruction D1 is identical to that described by statements (4.73). 
The micro-operations in the sequence charts in Figs. 4.11 and 4.12 are assigned to 
three micro-instructions: the initialization part for micro-instruction D2, the divide 
part for micro-instruction D3, and the finalization part for micro-instruction D4. 
The assignments of the control signals for the micro-operations in these sequence 
charts are shown below. 


Comment, microprogramming the division sequence                      (4.81)
Comment, micro-instruction D1
/D'*P(0)*F(8)/       F←CM(H),
/MC(0)*P(1)*F(26)/   AD←K,
/MC(2)*P(1)*F(26)/   SR←M(AD),
/MC(2)*P(3)*F(27)/   H←countup H,
Comment, micro-instruction D2                                        (4.82)
/D'*P(0)*F(8)/       F←CM(H),
/D'*P(1)*F(23)/      AC(R,Q,M)←0-0-AC(M)',
                     IF (Z(0)=1) THEN (DVOV←1, BR(1)←1)
                     ELSE (BR(2)←1),
/D'*P(2)*F(23)/      IF (BR(2)=1) THEN (MQ(S)←SR(S)⊕AC(S), SC←35),
/D'*P(3)*F(10)/      IF (BR(1)=1) THEN (H←F(ADS))
                     ELSE (H←countup H),              $F(ADS)=exit address
/D'*P(3)*F(18)/      IF (BR(1)=1) THEN (DO DSET),
/D'*P(3)*F(11)/      BR←0,
Comment, micro-instruction D3                                        (4.83)
/D'*P(0)*F(8)/       F←CM(H),
/D'*P(1)*F(20)/      SC←countdn SC,
/D'*P(1)*F(24)/      AC(M)-MQ(M)←AC(2-35)-MQ(1)-MQ(2-35)-0,
/D'*P(2)*F(24)/      IF (Z(Q)=1) THEN (MQ(35)←1,
                     AC(R,Q,M)←0-0-AC(M) add2 0-0-SR(M)-0),
/D'*P(3)*F(21)/      IF (SC=0) THEN (H←countup H),
Comment, micro-instruction D4                                        (4.84)
/D'*P(0)*F(8)/       F←CM(H),
/D'*P(1)*F(25)/      AC(M)←AC(M)',
/D'*P(3)*F(13)/      H←F(ADS),                        $F(ADS)=exit address
/D'*P(3)*F(14)/      DO DSET,


In the description above, bits BR(1,2) are used for branching in micro-instruction
D2. The subtraction required for the division is carried out by micro-instruction
D3. The subtraction is performed 35 times by repeatedly executing micro-instruction
D3 until counter SC reaches 0. At that time, micro-instruction D4 is read out of the
control memory and executed. The division can be terminated during the execution of
micro-instruction D2 or D4. The termination is again accomplished by executing
micro-statement DO DSET.

The division sequence takes seven main memory cycles. During the first cycle,
micro-instructions D1 and D2 are executed and micro-instruction D3 is executed
twice. D3 is executed six times during each of the second through sixth cycles. During
the seventh cycle, D3 is executed three times to complete the required 35 subtractions,
and micro-instruction D4 is then executed.


TABLE 4.3  Microprogram for the Parallel, Binary Arithmetic Unit

MICRO-                                          F
INSTRUCTION   H   0-7   8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27  28-35

AS1           0     0   1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  1
AS2           1     3   1  1  1  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
AS3           2    16   1  0  0  0  1  1  1  0  0  0  0  0  0  0  0  0  0  0  0  0
AS4           3    16   1  0  0  1  0  1  1  1  0  0  0  0  0  0  0  0  0  0  0  0
              4
              5
M1            6     0   1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  1
M2            7    16   1  0  1  1  0  0  0  0  1  1  1  0  0  0  0  0  0  0  0  0
M3            8     0   1  0  0  0  0  0  0  0  0  0  0  1  1  1  0  0  0  0  0  0
M4            9    16   1  0  0  0  0  1  1  0  0  0  0  0  0  0  1  0  0  0  0  0
             10
             11
D1           12     0   1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  1
D2           13    16   1  0  1  1  0  0  0  0  0  0  1  0  0  0  0  1  0  0  0  0
D3           14     0   1  0  0  0  0  0  0  0  0  0  0  0  1  1  0  0  1  0  0  0
D4           15    16   1  0  0  0  0  1  1  0  0  0  0  0  0  0  0  0  0  1  0  0
Exit         16

Note: All addresses are decimal


4.6.5 Microprogram 


The microprogram for the arithmetic unit is shown in Table 4.3, where there are 
12 micro-instructions. The first four micro-instructions located at addresses 0-3 
are for the add-subtract sequence. The four micro-instructions located at addresses 
6-9 are for the multiplication sequence. The four micro-instructions located at 
addresses 12-15 are for the division sequence. The exit address is chosen to be 16. 

As described in Chapter 3, the 1's in each micro-instruction are determined by
the control bits in the labels of those execution statements that describe the
micro-instruction. In this way, the microprogram in Table 4.3 is obtained from statements
(4.73)-(4.84).
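
The derivation of one row of the microprogram from the labels can be sketched as follows. The example uses micro-instruction AS3, whose execution statements (4.75) carry control bits F(8), F(12), F(13), and F(14) and whose address field holds the exit address 16. The helper names are illustrative, and the word is assumed to be stored with F(0) as its most significant bit.

```python
def control_word(address, control_bits):
    """Assemble a 36-bit control word from a branch address and a list of F(k) bits."""
    assert 0 <= address < 256
    word = address << 28                       # F(0-7)
    for k in control_bits:
        assert 8 <= k <= 27                    # only F(8-27) are control bits
        word |= 1 << (35 - k)                  # F(k) counted from the left
    return word


def control_bit_string(word):
    """Return bits F(8) through F(27) as a string of 0's and 1's."""
    return format((word >> 8) & 0xFFFFF, "020b")


as3 = control_word(16, [8, 12, 13, 14])
print(control_bit_string(as3))
```

The printed string has 1's in the positions of F(8), F(12), F(13), and F(14) only, which is the pattern required for micro-instruction AS3.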


References 


1. BURKS, A. W., GOLDSTINE, H. H., and VON NEUMANN, J., “Preliminary Discussion of the
Logical Design of an Electronic Computing Instrument, 1946,” Datamation 8, No. 9,
pp. 24-31, September, 1962, and 8, No. 10, pp. 36-41, October, 1962.

2. PHISTER, M., JR., Logical Design of Digital Computers. New York: John Wiley & Sons,
Inc., 1958.

3. BLAAUW, G. A., “Indexing and Control-Word Technique,” IBM Journal of Research
and Development 3, No. 3, pp. 288-301, July, 1959.

4. BECKMANN, F. S., BROOKS, F. P., JR., and LAWLESS, W. J., JR., “Developments in the
Logical Organization of Computer Arithmetic and Control Units,” Proceedings IRE,
49, No. 1, pp. 53-66, January, 1961.

5. MACSORLEY, O. L., “High-Speed Arithmetic in Binary Computers,” Proceedings IRE,
49, No. 1, pp. 67-101, January, 1961.

6. CHU, Y., Digital Computer Design Fundamentals. New York: McGraw-Hill Book
Company, 1962.

7. “IBM 7094 Principles of Operation,” Form A22-6703-4, 5th edition, October 21, 1966.

8. FLORES, I., Computer Design. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1967.

9. GSCHWIND, H. W., Design of Digital Computers. Springer-Verlag New York, Inc., 1967.

10. CHU, Y., Introduction to Computer Organization. Englewood Cliffs, N.J.: Prentice-Hall,
Inc., 1970.




Problems 


4.1. Draw a logic diagram for the parallel adder described by terminal statements (4.3) and
(4.4).


4.2. Complete the logic diagram in Fig. 4.5 to also show the sum terminals of the parallel
adder.


4.3. Describe by a terminal statement a parallel subtracter, and draw a logic diagram for
the parallel subtracter.


4.4. Describe by a terminal statement a parallel adder-subtracter which adds or subtracts
when input terminal C is 0 or 1, respectively.


4.5. By using an algorithmic language such as Fortran, write a program to simulate the
parallel adder described by terminal statements (4.3) and (4.4).


4.6. Let X and Y be the multiplicand and the multiplier, respectively. Show the fixed-point
multiplication step-by-step in the registers in the manner as shown in Fig. 4.10 for the
following cases:

(a) X = +101010 and Y = -111001

(b) X = -011101 and Y = -011110

(c) X = +110101 and Y = +010111


4.7. Let X and Y be the dividend and the divisor, respectively. Show the fixed-point
division step-by-step in the registers in the manner as shown in Fig. 4.13 for the
following cases:

(a) X = -10001010 and Y = +1010

(b) X = +11001111 and Y = -1010

(c) X = +11111111 and Y = +1001


4.8. By using an algorithmic language, write a program to test

(a) the fixed-point addition and subtraction described in Figs. 4.6 and 4.7,

(b) the fixed-point multiplication described in Figs. 4.8 and 4.9, and

(c) the fixed-point division described in Figs. 4.11 and 4.12.


4.9. Revise the sequence charts in Figs. 4.6 and 4.7 for the fixed-point addition and
subtraction where subtraction is performed by addition of the 2's complement of the
subtrahend.


4.10. Revise the sequence charts in Figs. 4.6 and 4.7 for the fixed-point addition and
subtraction for binary numbers in the signed 2's complement representation.


4.11. Revise the sequence charts in Figs. 4.8 and 4.9 for the fixed-point multiplication for
binary numbers in the signed 2's complement representation.


4.12. Revise the sequence charts in Figs. 4.11 and 4.12 for the fixed-point division for binary
numbers in the signed 2's complement representation.


4.13. How many execution cycles are required for the fixed-point multiplication described by
CDL statements (4.12)? How can the multiplication process be shortened?


4.14. How many execution cycles are required for the fixed-point division described by CDL
statements (4.13)? How can the division process be shortened?


4.15. By using an algorithmic language such as Fortran, write a program to simulate the 
parallel adder described by terminal statement (4.37). 


4.16. Let registers AC, SR, and MQ be double rank registers. Revise the multiplication 
sequence (4.12) accordingly. 


4.17. Repeat Problem 4.16 for the division sequence (4.13). 


4.18. If the control memory in the control configuration (4.67) is replaced by the first 256 
locations of the main memory, revise the control configuration and, if necessary, the 
timing and control signals. Show the control word format, the sequence description, 
and the microprogram. 


Chapter 5  A Parallel, Binary, Floating-point Arithmetic Unit


When the range of numbers that occurs during a calculation becomes very large
or very small, it is difficult and time consuming for the programmer to keep track
of the binary (or decimal) points of all the numbers throughout the calculation.
Therefore, it is very desirable to represent numbers by hardware in an exponent
form. When numbers are expressed in this form, they are called floating-point
numbers. The arithmetic to handle addition, subtraction, multiplication, and
division of such floating-point numbers is called floating-point arithmetic. The
implementation of floating-point arithmetic frees the programmer from the burden of
scaling the numbers during a calculation.

This chapter describes a parallel, binary arithmetic unit which is capable of
performing floating-point addition, subtraction, multiplication, and division.
Compared with the previously discussed fixed-point arithmetic unit, the floating-point
arithmetic unit is more complex since more functions are required.


5.1 Configuration of the Arithmetic Unit 


A description of the arithmetic unit begins with the floating-point number format, 
the configuration, and the parallel adder. Those computer elements required for 
sequencing floating-point addition, subtraction, multiplication, and division will be 
described in section 5.5.1. 


5.1.1 Floating-point Number Representation 


A number N may be represented in an exponent form as below. 


$$N = f \cdot r^{e+b} \tag{5.1}$$

where r = radix
      b = bias
      e = exponent
      f = signed number


The above radix r, bias b, and exponent e are integers, while f is, in most cases, a 
signed fraction. 

For a given number of bits to represent a number, the floating-point represen- 
tation can increase the range of number representation but decrease the significant 
bits of the number. Furthermore, as will be shown, the time required for floating- 
point addition or subtraction is usually longer than that required for fixed-point 
addition or subtraction, but the time for floating-point multiplication or division 
may become shorter than that required for fixed-point multiplication or division. 

In order to describe the floating-point arithmetic unit, the following values and 
numbers are chosen, 


r = 2,
e = an 8-bit signed integer,                                         (5.2)
b = 200₈,
f = a 28-bit signed fraction (i.e., |f| < 1)


Furthermore, the quantity (e + b) is called the characteristic of the floating-point
number, or

$$ch = e + b \tag{5.3}$$

Since radix r and bias b are known, each floating-point number can simply be
represented by exponent e and fraction f or, alternatively, by characteristic ch and fraction f.

The floating-point numbers for the chosen values and numbers above require
a 36-bit word for their representation, 8 bits for the exponent and 28 bits for the signed
fraction. The format chosen for the 36-bit floating-point numbers is shown in Fig.
5.1. In this figure, characteristic ch and fraction f (instead of e and f) are selected to
represent the floating-point number. The characteristic is an 8-bit unsigned integer
located at bit positions 1 to 8. The signed fraction is in the signed magnitude
representation with its sign bit located at bit position 0 and its magnitude at bit positions
9 to 35. The binary point is located between bit positions 8 and 9.


S 1 8 9 35

Fig. 5.1 A floating-point number format


TABLE 5.1  The Values of the Characteristic

CHARACTERISTIC (OCTAL)     EXPONENT (DECIMAL)

Underflow                  Less than (-128₁₀)
000₈                       -128₁₀
001₈                       -127₁₀
 ...                        ...
177₈                       -1
200₈                       0
201₈                       +1
 ...                        ...
377₈                       +127₁₀
Overflow                   Larger than (+127₁₀)
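
The field layout of the 36-bit word can be sketched as follows. The sketch assumes that the word is held as an integer with bit position 0 (the sign) as the most significant bit; the function names are illustrative. Only the packing of the fields and the relation ch = e + b are shown, not the arithmetic.

```python
BIAS = 0o200                     # b = 200 (octal)
FRACTION_BITS = 27               # magnitude of the fraction, bit positions 9 to 35
FRACTION_MASK = (1 << FRACTION_BITS) - 1


def pack(sign, characteristic, fraction):
    """Assemble sign (bit 0), characteristic (bits 1-8) and fraction (bits 9-35)."""
    assert sign in (0, 1)
    assert 0 <= characteristic <= 0o377
    assert 0 <= fraction <= FRACTION_MASK
    return (sign << 35) | (characteristic << FRACTION_BITS) | fraction


def unpack(word):
    """Split a 36-bit word into (sign, characteristic, fraction magnitude)."""
    return (word >> 35) & 1, (word >> FRACTION_BITS) & 0o377, word & FRACTION_MASK


def exponent(characteristic):
    """Recover the exponent e from the characteristic ch = e + b."""
    return characteristic - BIAS


word = pack(1, 0o201, 1 << 26)   # negative, exponent +1, fraction magnitude 1/2
print(unpack(word), exponent(0o201))
```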


5.1.2 Characteristic Part 


The possible values of the characteristic of a floating-point number represented
by the format in Fig. 5.1 are shown in Table 5.1. As shown in this table, the
characteristic is obtained by adding 200₈ to the value of the exponent. The exponent is 0 if the
characteristic is 200₈. The exponent is negative if the characteristic is in the range 0 to 177₈. Thus,
the values of the characteristic range from a minimum of 0 to a maximum of 377₈.

When the value of the characteristic is greater than 377₈, the number exceeds
the upper characteristic limit that the format can represent; this condition is known
as floating-point overflow. The largest positive number that can be represented by
the format is

$.777777777_8 \times 2^{377_8 - 200_8}$


On the other hand, when the value of the characteristic is smaller than 0, the number
exceeds the lower characteristic limit; this condition is known as floating-point
underflow. The smallest positive number that can be represented by the format is


$.000000001_8 \times 2^{000_8 - 200_8}$
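
The limits above can be checked numerically. The following Python sketch assumes the bias of 200 octal and the 27-bit fraction magnitude described in the text; the function name `to_characteristic` is a hypothetical helper, not part of the original description.

```python
from fractions import Fraction

BIAS = 0o200          # 200 octal = 128 decimal
FRACTION_BITS = 27    # magnitude bits of the fraction


def to_characteristic(exponent):
    """Bias an exponent; report overflow or underflow outside 0..377 octal."""
    ch = exponent + BIAS
    if ch > 0o377:
        return None, "overflow"
    if ch < 0:
        return None, "underflow"
    return ch, "ok"


# Largest positive value: fraction of all 1's with characteristic 377 octal.
largest = Fraction(2**FRACTION_BITS - 1, 2**FRACTION_BITS) * Fraction(2) ** (0o377 - BIAS)

# Smallest positive value: a single 1 at bit position 35 with characteristic 0.
smallest = Fraction(1, 2**FRACTION_BITS) * Fraction(2) ** (0 - BIAS)
```

Exact rational arithmetic is used here only so that the two extreme values can be compared without rounding.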


5.1.3 Fraction Part 


A floating-point number may be represented in many different forms, each with 
a different fraction. A particular representation of a floating-point number is said 
to be in the normal form if the most significant digit of the fraction is nonzero so 
that the magnitude of the fraction is less than 1 but equal to or greater than 1/r. For 
the floating-point number represented in the format of Fig. 5.1, the normal form 
requires that a 1 exists at bit position 9 so that the magnitude of the fraction will be


$1/2 \le |f| < 1$    (5.4)


The normal form of a floating-point number is a unique representation. The process 
of changing an abnormal form into a normal form is called normalization. A
floating-point number with a zero fraction cannot be normalized.

The normal form has an exception which occurs when the fraction is zero. This 
form is called the normal zero. It is defined here as the floating-point number whose 
characteristic and fraction are both 0. 
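
As an illustration, the normal-form condition can be expressed as a test on the fraction field. This is a minimal Python sketch; the function name `is_normal` is an assumption made for the example.

```python
FRACTION_BITS = 27  # fraction magnitude occupies bit positions 9-35


def is_normal(characteristic, fraction):
    """True if the number is in the normal form or is the normal zero."""
    if fraction == 0:
        # The only accepted zero is the normal zero: both fields are 0.
        return characteristic == 0
    # Bit position 9 is the most significant fraction bit.
    return ((fraction >> (FRACTION_BITS - 1)) & 1) == 1
```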


5.1.4 Configuration 


The parallel, binary, floating-point arithmetic unit consists of three registers 
AC, MQ, and SR, in addition to four single-bit indicators as shown in Fig. 5.2. Each 
of the three registers is divided into two parts, one for the characteristic part and the 
other for the fraction part of a floating-point number. To be specific, subregisters 
SR(CH), AC(CH), and MQ(CH) are the registers for the characteristic parts, and 
subregisters SR(FR), AC(FR), and MQ(FR) are the registers for the fraction parts. 
Register SI stores the sign of the result. Registers CHOV and CHUN indicate floating- 
point overflow and floating-point underflow, respectively. Register DVOV indicates 
the division overflow. 

The configuration in Fig. 5.2 is now described by the following CDL statements.


Comment, configuration of the parallel floating-point arithmetic unit (5.5)
Register, AC(S,Q,1-8,9Q,9-35), $accumulator
          SR(S,1-35), $storage register of the memory
          MQ(S,1-35), $multiplier-quotient register
          SC(0-5), $shift counter
          SI, $sign indicator
          CHOV, $floating-point overflow indicator
          CHUN, $floating-point underflow indicator
          DVOV, $division overflow indicator


Subregister, AC(CH)=AC(1-8), $characteristic part of AC
             SR(CH)=SR(1-8), $characteristic part of SR
             MQ(CH)=MQ(1-8), $characteristic part of MQ
             AC(FR)=AC(9-35), $fraction part of AC
             SR(FR)=SR(9-35), $fraction part of SR
             MQ(FR)=MQ(9-35), $fraction part of MQ


[Fig. 5.2 shows register SR (S, SR(CH) at bits 1-8, SR(FR) at bits 9-35), register AC (S, Q, AC(CH) at bits 1-8, 9Q, AC(FR) at bits 9-35), register MQ (S, MQ(CH) at bits 1-8, MQ(FR) at bits 9-35), indicators CHOV, CHUN, and DVOV, and shift counter SC(0-5).]

Fig. 5.2 Configuration of a parallel, binary, floating-point arithmetic unit


Comment, parallel adder for the characteristic addition (5.6)
Terminal, ADD(Q,1-8)=ADSR(Q,1-8)⊕ADAC(Q,1-8)⊕C(Q,1-8),
          C(Q,1-7)=ADSR(1-8)*ADAC(1-8)+ADAC(1-8)*C(1-8)+C(1-8)*ADSR(1-8),
          C(8)=0,


Comment, parallel adder for the fraction addition (5.7)
Terminal, ADD(9Q,9-35)=ADSR(9Q,9-35)⊕ADAC(9Q,9-35)⊕C(9Q,9-35),
          C(9Q,9-34)=ADSR(9-35)*ADAC(9-35)+ADAC(9-35)*C(9-35)+C(9-35)*ADSR(9-35),
          C(35)=0,


Comment, declaration of terminals Z (5.8)
Terminal, Z(Q,1-8)=0-AC(CH) add 0-SR(CH)


Just as in the fixed-point arithmetic unit, the above bits AC(Q) and AC(9Q) are 
provided in the floating-point arithmetic unit. Bit AC(Q) stores the carry bit from 
the most significant bit of the characteristic, while bit AC(9Q) stores the carry bit 
from the most significant bit of the fraction. These bits are normally examined after 
an addition or a subtraction for the purpose of determining whether an overflow or 
an underflow occurs. 

There are two parallel adders, one for characteristic addition and the other for 
fraction addition. These two adders are described by the terminal statements (5.6) 
and (5.7), where ADSR(Q,1-8) and ADAC(Q,1-8) are the input terminals, C(Q,1-8)
the carries, and ADD(Q,1-8) the output terminals of the parallel adder for the char- 
acteristic addition. ADSR(9Q,9-35) and ADAC(9Q,9-35) are the input terminals, 
C(9Q,9-34) are the carries, and ADD(9Q,9-35) are the output terminals of the 
parallel adder for the fraction addition. The two parallel adders above can alterna- 
tively be described in a more convenient way by basic operator add. 

In the subsequent description of floating-point arithmetic processes, a test is 
required on one of the output terminals of the parallel adder for characteristic addi- 
tion. In this case, one set of inputs of the parallel adder is connected from subregister 
AC(CH), and the other set of inputs from subregister SR(CH). The output terminals 
of the parallel adder now called Z are declared by the terminal statement (5.8). 
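
The behavior of the characteristic adder can be sketched in Python as follows. The sketch assumes the usual full-adder equations: each sum bit is the exclusive-or of the two inputs and the incoming carry, and each carry is the majority of the three bits to its right, with a 0 carry into bit 8. The function names are hypothetical.

```python
def to_bits(value, width):
    """Most significant bit first, so index 0 corresponds to position Q."""
    return [(value >> (width - 1 - i)) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def characteristic_adder(adsr, adac):
    """Ripple-carry addition over positions Q,1-8 given as 9-element bit lists."""
    n = len(adsr)
    carry = [0] * n                      # carry[i] is the carry INTO position i
    for i in range(n - 2, -1, -1):       # carry[n-1] stays 0, like C(8)=0
        a, b, c = adsr[i + 1], adac[i + 1], carry[i + 1]
        carry[i] = (a & b) | (b & c) | (c & a)
    return [adsr[i] ^ adac[i] ^ carry[i] for i in range(n)]
```

A carry out of bit position 1 appears in position Q of the sum, which is the bit later examined for overflow.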


5.2 Floating-point Addition and Subtraction 


The floating-point addition adds the floating-point addend in the SR register 
to the floating-point augend in the AC register. The floating-point subtraction sub- 
tracts the floating-point subtrahend in the SR register from the floating-point minuend 
in the AC register. The floating-point operands initially in registers AC and SR are 
assumed to be in the normal form. Recall that the fractions of these floating-point 
numbers are in the signed magnitude representation. 

Figure 5.3 presents a flowchart showing floating-point addition and subtraction. 
After initialization, the floating-point addition and subtraction first requires the 
alignment of the characteristics of the floating-point numbers, then the addition or 
subtraction of the fractions of the floating-point numbers, and finally the normaliza- 
tion of the sum after addition or of the difference after subtraction. Floating-point 
overflow during addition and floating-point underflow during normalization are 
indicated when they occur. 


[Fig. 5.3 flowchart boxes: Start; Initialization; Characteristic alignment; Fraction addition or subtraction; End. Exit labels: "Sum being zero"; "Overflow or difference being zero".]

Fig. 5.3 Floating-point addition and subtraction


5.2.1 Initialization 


The sequence chart for the initialization part of the floating-point addition and 
subtraction is shown in Fig. 5.4. There are four tasks. The sign of the subtrahend in 
the SR register is complemented in case of subtraction, so that the subtraction can 
then become addition. Register MQ is reset to 0 because it will hold a portion of the 
fraction in the AC register that may be shifted into register MQ during characteristic 
alignment. The signs of the augend and the addend are compared. If the signs are 
the same, indicator SI is reset to 0; otherwise, it is set to 1. Finally, the sum is deter- 
mined if the augend or the addend or both have a zero fraction. If both fractions are 
zero, register AC is reset to 0 so that the sum is in the form of a normal zero. If only 
the fraction of the addend in the SR register is zero, the floating-point augend in 
the AC register becomes the sum, and nothing further needs to be done. If only the 
fraction of the augend in the AC register is zero, the floating-point addend in the
SR register becomes the sum; thus, the addend is transferred from the SR register
to the AC register. If neither of the fractions is zero, the floating-point addition
process proceeds to characteristic alignment.


[Fig. 5.4 sequence chart, recoverable labels: Start; Floating-point add; Floating-point subtract; SR(S)←SR(S)'; test AC(FR)=0; End; Proceed to characteristic alignment.]

Fig. 5.4 Sequence chart for the initiation part of floating-point
addition and subtraction
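
The four initialization tasks described above can be sketched in Python. Registers are modeled as small dictionaries with keys `sign`, `ch`, and `fr`; this representation and the function name `init_add_sub` are assumptions made for illustration.

```python
def init_add_sub(ac, sr, subtract):
    """Return (ac, sr, mq_fr, si, done) after the initialization part.

    done is True when the sum is already determined by a zero fraction.
    """
    ac, sr = dict(ac), dict(sr)          # work on copies of the registers
    if subtract:
        sr["sign"] ^= 1                  # complement the subtrahend sign
    mq_fr = 0                            # MQ will receive bits shifted out of AC
    si = 0 if ac["sign"] == sr["sign"] else 1
    done = True
    if ac["fr"] == 0 and sr["fr"] == 0:
        ac = {"sign": 0, "ch": 0, "fr": 0}   # normal zero
    elif sr["fr"] == 0:
        pass                             # augend in AC is already the sum
    elif ac["fr"] == 0:
        ac = dict(sr)                    # addend becomes the sum
    else:
        done = False                     # go on to characteristic alignment
    return ac, sr, mq_fr, si, done
```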


5.2.2 Characteristic alignment 


The addition of two floating-point numbers requires the addition of the fractions
of these floating-point numbers when they have the same characteristics. If the
characteristics of these floating-point numbers are not the same, they must first be
aligned before addition can be accomplished. The sequence chart for the
characteristic alignment part of the floating-point addition and subtraction is shown in Fig. 5.5.

In aligning the characteristics of two floating-point numbers, one has the choice 
of either shifting the fraction of the floating-point number with a smaller characteristic 
to the right so that its characteristic value can be increased, or shifting the fraction 
of the floating-point number with a larger characteristic to the left so that its charac- 
teristic value can be decreased. Because register MQ is available for holding shifted 
bits as is required during multiplication, the former is chosen. Since the AC register 
is implemented with a shift logic-network as is also required during multiplication, 
the floating-point number with a smaller characteristic should be placed in the AC 
register. Therefore, characteristic alignment begins with the comparison of the two 
characteristics. If they are equal, no alignment is required. If the characteristic in 
the AC register is smaller than that in the SR register, then the floating-point number 
in the AC register is ready for alignment. Otherwise, the floating-point numbers 
in the AC and SR registers are exchanged. In either case, the casregister which is 
formed by cascading bit AC(9Q) with the fraction parts of the AC and MQ registers 
is shifted to the right one bit position, and at the same time the characteristic part 
of the AC register is incremented by one. 

The above comparison of characteristics is accomplished by a subtraction test.
This subtraction is replaced by addition of the 1’s complement of the subtrahend.
Therefore, the contents of subregister AC(Q,CH) are 1’s complemented before the test.
They remain in the 1’s complement form for use in subsequent subtraction tests 
until the characteristics are aligned; at that time, they are restored to their original 
value. Notice that it is necessary for subregister AC(Q,CH) to be 1’s complemented 
before and after the numbers in registers AC and SR are exchanged. 

There are five micro-operations involved in aligning the characteristics. Cas- 
register AC(9Q,FR)-MQ(FR) is shifted one bit position to the right and, at the same 
time, the characteristic in subregister AC(Q,CH) and the count in shift counter SC 
are both decremented by one. Subregister AC(Q,CH) is decremented instead of being 
incremented, because the contents of subregister AC(Q,CH) are in the 1’s comple- 
ment form. Subregister AC(Q,CH) is tested for zero so that the sequence can exit as 
soon as the characteristics are aligned. Counter SC is tested for zero so that the 
sequence can exit when right-shifts reach the maximum of 27 times. 


5.2.3 Fraction Addition and Subtraction 


The sequence chart for the addition or subtraction of the fractions of two floating- 
point numbers is shown in Fig. 5.6. This sequence is similar to that for the fixed-point 
numbers. As shown in this figure, addition or subtraction is revealed by the value of 
register SI. If SI is 0, the signs of the two operands are the same and an addition is 
required; otherwise, a subtraction is required. 

If an addition is required, the fraction part of the SR register is added to that of
the AC register. Bit AC(9Q) is then tested. If it is 1, casregister AC(9Q,FR)-MQ(FR)
is shifted one bit position to the right so that the sum remains a fraction. This step
simplifies the subsequent normalization because only left-shifting of the casregister
is needed at that time. At the same time, the characteristic in subregister AC(Q,CH)
is incremented and is then followed by a floating-point overflow test. If overflow
occurs, it is indicated. Whether or not the overflow occurs, the floating-point process
is terminated.


[Fig. 5.5 sequence chart, recoverable labels: Characteristic alignment; AC(CH)←AC(CH)', SC←33₈; comparison of SR(CH) with AC(CH); AC(9Q,FR)-MQ(FR)←shr AC(9Q,FR)-MQ(FR), AC(Q,CH)←countdn AC(Q,CH), SC←countdn SC; AC(CH)←AC(CH)'; Proceeds to fraction addition part.]

Fig. 5.5 Sequence chart for the characteristic alignment part
of floating-point addition and subtraction


[Fig. 5.6 sequence chart, recoverable labels: Fraction addition part; AC(9Q,FR)←AC(9Q,FR) add 0-SR(FR); AC(9Q,FR)←AC(9Q,FR)'; test AC(9Q)=1; AC(9Q,FR)-MQ(FR)←shr AC(9Q,FR)-MQ(FR), AC(Q,CH)←countup AC(Q,CH); test of AC(CH) against 377₈; CHOV←1; Proceeds to normalization; End.]

Fig. 5.6 Sequence chart for the fraction addition part of
floating-point addition and subtraction

If a subtraction is required, the fraction part in the AC register is 1’s
complemented so that the subtraction can be replaced by the addition of the 1’s complement of
the subtrahend. The fraction part in the SR register is now added to that in the AC
register. Bit AC(9Q) is next tested. If it is 1, it indicates that the fraction part in the 
SR register was larger than the fraction part in the AC register. In this case, an end- 
around carry is added to the fraction sum in subregister AC(9Q,FR), and sign AC(S) 
is replaced by sign SR(S) because the difference takes the sign of the larger operand. 
If bit AC(9Q) is 0, this indicates that the fraction part in the SR register was equal 
to or smaller than that in the AC register. The fraction part now in the AC register 
is next tested to see whether it contains all 1’s. If it contains all 1’s, the difference is 
a zero; as a result, register AC is reset to 0. If it does not contain all 1’s, the difference 
is negative; the fraction part in the AC register is once more 1’s complemented to 
give the true magnitude of the difference. 

At the end of addition or subtraction, the most significant part of the result is
in subregister AC(S,FR), and the least significant part in subregister MQ(FR). If the
sum has not overflowed after addition or if the difference is not a normal zero after
subtraction, the process then proceeds to normalization.
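
The subtraction branch can be illustrated by the following Python sketch, which forms the difference of two 27-bit magnitudes by adding the 1's complement and inspecting the carry that would appear in bit AC(9Q). The function name `subtract_magnitudes` is an assumption, and the contents of MQ(FR) are ignored here.

```python
FRACTION_BITS = 27
MASK = (1 << FRACTION_BITS) - 1


def subtract_magnitudes(ac_fr, ac_sign, sr_fr, sr_sign):
    """Return (fraction, sign) of the result when the operand signs differ."""
    total = (~ac_fr & MASK) + sr_fr        # 1's complement of AC(FR) plus SR(FR)
    carry = total >> FRACTION_BITS         # plays the role of bit AC(9Q)
    total &= MASK
    if carry:
        # SR(FR) was larger: add the end-around carry and take the sign of SR.
        return (total + 1) & MASK, sr_sign
    if total == MASK:
        # All 1's: the difference is zero, so the result is the normal zero.
        return 0, 0
    # AC(FR) was larger: complement again to recover the true magnitude.
    return ~total & MASK, ac_sign
```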


5.2.4 Normalization 


As mentioned before, the process of changing a floating-point number into the 
normal form is called normalization. The sequence chart for the normalization part 
of floating-point addition and subtraction is shown in Fig. 5.7. 

The normalization of a floating-point number in the format of Fig. 5.1 requires 
the presence of a 1 in the 9th bit position. Therefore, bit AC(9) is tested. If it contains 
a 1, the number is in the normal form. If it contains a 0, normalization is required. 

The normalization process begins by setting shift counter SC to 27. It then shifts
casregister AC(9Q,FR)-MQ(FR) one bit position to the left and, at the same time,
decrements both subregister AC(Q,CH) and shift counter SC by one. Shift counter
SC is next tested for 0. If it contains a 0, the floating-point number is 0 and registers
AC and MQ are both reset to 0. Otherwise, bit AC(Q) is tested for possible
floating-point underflow. If there is an underflow, register CHUN is set to 1; both registers
AC and MQ are also reset to 0, and this process is terminated. If neither counter SC
contains a 0 nor any underflow occurs, bit AC(9) is again tested. The casregister is
again left-shifted. Subregister AC(Q,CH) and counter SC are again both decremented.
Counter SC and bit AC(Q) are again tested for 0 and 1, respectively. These steps are
repeated until a 1 appears in bit AC(9). At this time, the sum is in the normal form.
The fraction part in subregister AC(FR) is now rounded off by adding 1 to subregister
AC(9Q,FR) if MQ(9) is 1 and register MQ is then reset to 0. The process of
floating-point addition and subtraction is now completed.


[Fig. 5.7 sequence chart, recoverable labels: Normalization part; AC(9Q,FR)-MQ(FR)←shl AC(9Q,FR)-MQ(FR), AC(Q,CH)←countdn AC(Q,CH), SC←countdn SC; test AC(Q)=1; CHUN←1; test MQ(9)=1; AC(9Q,FR)←AC(9Q,FR) add 1; End.]

Fig. 5.7 Sequence chart for the normalization part of
floating-point addition and subtraction


In the normalization above, only left-shifting of the casregister is needed because 
there is no 1 at bit AC(9Q) owing to the one-bit right-shift which occurs during the 
previous fraction addition as shown in Fig. 5.6. 
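
The normalization loop can be sketched in Python as follows. The sketch uses plain integers, detects underflow by the characteristic falling below 0 rather than by bit AC(Q), and leaves any carry produced by the final rounding to the caller; these simplifications and the function name `normalize` are assumptions.

```python
FRACTION_BITS = 27
MASK = (1 << FRACTION_BITS) - 1
TOP = 1 << (FRACTION_BITS - 1)            # bit position 9 of a fraction


def normalize(ch, ac_fr, mq_fr):
    """Return (ch, ac_fr, chun) after left-shifting until AC(9) is 1."""
    count = 27                            # shift counter SC
    while not (ac_fr & TOP):
        cascade = ((ac_fr << FRACTION_BITS) | mq_fr) << 1   # shl AC(FR)-MQ(FR)
        ac_fr = (cascade >> FRACTION_BITS) & MASK
        mq_fr = cascade & MASK
        ch -= 1
        count -= 1
        if count == 0:
            return 0, 0, 0                # no 1 was found: the result is zero
        if ch < 0:
            return 0, 0, 1                # underflow: CHUN is set
    if mq_fr & TOP:                       # MQ(9)=1: round the AC fraction up
        ac_fr += 1
    return ch, ac_fr, 0
```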


5.3 Floating-point Multiplication 


Floating-point multiplication multiplies a floating-point multiplier in the MQ
register by a floating-point multiplicand in the SR register and produces a product
in the casregister combined from the AC and MQ registers with the most significant
part in the AC register and the least significant part in the MQ register. The
floating-point operands in registers SR and MQ are assumed initially in the normal form, and
the fractions of these floating-point numbers are in the signed magnitude representation.

The flowchart in Fig. 5.8 shows the process of floating-point multiplication.
The sign of the product is first determined. Since a zero fraction in the multiplicand
or in the multiplier produces a zero product, the multiplicand and the multiplier are
now tested for zero fraction. If a zero fraction is found, the product is set to 0, and
the multiplication process is terminated. Otherwise, it proceeds to floating-point
multiplication.


[Fig. 5.8 flowchart boxes: Start; Set product sign; Handle zero multiplicand and zero multiplier; Characteristic addition; Fraction multiplication; Normalization; End. Exit label: "Zero multiplicand or zero multiplier".]

Fig. 5.8 Floating-point multiplication

Floating-point multiplication is accomplished by adding the characteristic parts 
in the SR and MQ registers and then multiplying the fraction part in the SR register 
by that in the MQ register. As shown in Fig. 5.8, these addition and multiplication 
processes can proceed simultaneously. After characteristic addition and fraction 
multiplication are completed, the product is normalized and then rounded off. 


5.3.1 Initiation 


The initiation part of the floating-point multiplication is shown in the sequence 
chart in Fig. 5.9. As shown, the sign of the product is 0 if the signs of the multiplicand 
and the multiplier are the same; it is 1 if they are different. The sign of the product is 
placed at both sign bits AC(S) and MQ(S). If the multiplicand or the multiplier has 
a zero fraction, the product is zero, and both registers AC and MQ are reset to 0 
so that the product can be a normal zero. If neither is zero, the floating-point multi- 
plication proceeds to characteristic addition and fraction multiplication. 


5.3.2 Characteristic Addition 


Characteristic addition adds the characteristic of the multiplicand in subregister
SR(CH) and the characteristic of the multiplier in subregister MQ(CH) to give the
characteristic of the product in subregister AC(CH). Characteristic addition is shown
in the sequence chart in Fig. 5.10. The characteristic of the multiplier in subregister
MQ(CH) is first transferred to subregister AC(CH) because the addition is performed
in the AC register. Then, the characteristic of the multiplicand in subregister SR(CH)
is added to the characteristic of the multiplier now in subregister AC(Q,CH). The
quantity 200₈ must be subtracted from this sum because each characteristic is obtained by
biasing the exponent by 200₈, so that the sum in subregister AC(CH) now includes
two biases. The subtraction of 200₈ is equivalent to the addition of its
2’s complement, or 600₈. Bit AC(Q) is next tested. If it contains a 1, this indicates
that the characteristic in subregister AC(CH) is greater than 377₈; thus, a
floating-point overflow occurs. When an overflow occurs, overflow indicator CHOV is set
to 1, and the process of floating-point multiplication is terminated. If bit AC(Q)
contains a 0, the process of floating-point multiplication proceeds to normalization.
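
The arithmetic of the characteristic addition can be checked with a short Python sketch that models subregister AC(Q,CH) as a 9-bit quantity. The function name `add_characteristics` is hypothetical.

```python
def add_characteristics(mq_ch, sr_ch):
    """Return (ac_ch, q) where q models bit AC(Q) after the two additions."""
    # Adding 600 octal in 9 bits is the same as subtracting the extra bias 200 octal.
    total = (mq_ch + sr_ch + 0o600) & 0o777
    q = total >> 8                        # bit AC(Q)
    return total & 0o377, q
```

For instance, characteristics 201 and 202 octal (exponents 1 and 2) give 203 octal (exponent 3) with bit Q equal to 0. In this simple model, a sum of characteristics below 200 octal also leaves bit Q at 1, a case the description above does not separate from overflow.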


5.3.3 Fraction Multiplication 


While the characteristics of the multiplicand and the multiplier are being added,
their fractions are being multiplied. The sequence chart in Fig. 5.11 shows the
multiplication of the two fractions. As shown in Fig. 5.11, the multiplication of the
fractions is carried out as the multiplication of two fixed-point numbers; the
repeated-addition method is used for this multiplication.


[Fig. 5.9 sequence chart, recoverable labels: Start; AC(S)←1, MQ(S)←1; AC(S)←0, MQ(S)←0; Proceeds to characteristic addition; Proceeds to fraction multiplication; End.]

Fig. 5.9 Sequence chart for the initiation part of floating-point
multiplication

In Fig. 5.11, shift counter SC is initially set to 27 because there are 27 bits in
the fraction, and subregister AC(9Q,FR) is initially reset to 0. Bit MQ(35) is then
tested for 1. If it is a 1, the two fractions are added; otherwise, there is no addition.
In either case, casregister AC(9Q,FR)-MQ(FR) is shifted one bit position to the
right and, at the same time, counter SC is decremented by one. Counter SC is next
tested for 0. If it does not contain a 0, the four steps of testing bit MQ(35), addition
or no addition, right-shifting, and testing counter SC are repeated until shift counter
SC contains a 0. At this time, the process of floating-point multiplication proceeds to
normalization.


[Fig. 5.10 sequence chart, recoverable labels: Characteristic addition; AC(Q,CH)←0-MQ(CH); AC(Q,CH)←AC(Q,CH) add 0-SR(CH); AC(Q,CH)←AC(Q,CH) add 600₈; Proceeds to normalization; CHOV←1; End.]

Fig. 5.10 Sequence chart for characteristic addition part of
floating-point multiplication
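
The repeated-addition loop described above can be sketched in Python. The variable `ac` stands for subregister AC(9Q,FR), which needs one extra bit to hold the carry of each addition; the function name `multiply_fractions` is an assumption.

```python
FRACTION_BITS = 27
MASK = (1 << FRACTION_BITS) - 1


def multiply_fractions(sr_fr, mq_fr):
    """Return (ac_fr, mq_fr) holding the 54-bit product of two 27-bit fractions."""
    ac = 0                                   # AC(9Q,FR) is reset to 0
    for _ in range(FRACTION_BITS):           # shift counter SC counts 27 steps
        if mq_fr & 1:                        # test bit MQ(35)
            ac += sr_fr                      # add the multiplicand fraction
        cascade = ((ac << FRACTION_BITS) | mq_fr) >> 1   # shr AC(9Q,FR)-MQ(FR)
        ac, mq_fr = cascade >> FRACTION_BITS, cascade & MASK
    return ac, mq_fr
```

With both fractions equal to one half, the product has a single 1 at bit position 10 of the AC fraction, so one left shift is needed in the normalization that follows.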


5.3.4 Normalization 


The product is now in casregister AC(9Q,FR)-MQ(FR) with the most
significant part in subregister AC(FR) and the least significant part in subregister MQ(FR).
The normalization of the product can make use of the algorithm described in the
sequence chart in Fig. 5.7. The possibility of a normal zero in the AC register at
this time does not exist because the presence of a zero fraction in the multiplicand
or in the multiplier has already been tested in the initiation part as shown in the
sequence chart of Fig. 5.9.


[Fig. 5.11 sequence chart, recoverable labels: Fraction multiplication; SC←33₈; AC(9Q,FR)←0; test MQ(35)=1; AC(9Q,FR)←AC(9Q,FR) add 0-SR(FR); AC(9Q,FR)-MQ(FR)←shr AC(9Q,FR)-MQ(FR), SC←countdn SC; Proceeds to normalization.]

Fig. 5.11 Sequence chart for the fraction multiplication part
of floating-point multiplication


5.4 Floating-point Division 


Floating-point division divides a floating-point dividend in the AC register by a 
floating-point divisor in the SR register and produces a floating-point quotient in 
the MQ register and a remainder in the AC register. Register MQ is initially reset 
to zero. The floating-point operands in registers AC and SR are initially in the normal 
form, and the fractions of these floating-point numbers are in the signed magnitude
representation. The quotient in the MQ register will be in the normal form. 

Figure 5.12 shows the floating-point division. The signs of the quotient and the
remainder are first determined. If the dividend is zero, the result is zero. If the divisor
is zero, a division overflow occurs. In either case, the division process is then
terminated. If neither case happens, floating-point division begins by first aligning
the dividend; it then subtracts the characteristic of the divisor from the
characteristic of the dividend and divides the fraction of the dividend by the fraction
of the divisor. As shown in Fig. 5.12, the characteristic subtraction and fraction
division can proceed at the same time.


[Fig. 5.12 flowchart boxes: Start; Set signs of quotient and remainder; Handle zero divisor or zero dividend; Dividend alignment; Fraction division; Characteristic subtraction; End. Exit label: "DVOV or zero dividend".]

Fig. 5.12 Floating-point division


5.4.1 Initiation 


The initiating part of the floating-point division is shown in the sequence chart
in Fig. 5.13. As shown, register MQ is first reset to 0, and the sign of the quotient is
then determined and stored in bit MQ(S). Bit AC(S), the sign of the dividend, is left
unchanged as the sign of the remainder.


[Fig. 5.13 sequence chart, recoverable labels: Start; Proceeds to dividend alignment; End.]

Fig. 5.13 Sequence chart for the initiation part of floating-point
division

As shown in Fig. 5.13, subregister SR(FR) is tested for 0. If it is zero, the quotient
cannot be determined because the divisor has a zero fraction; division overflow
indicator DVOV is set to 1 and the division process is terminated. If the fraction 
part in subregister SR(FR) is not 0, subregister AC(FR) is then tested for 0. If it is 
zero, the dividend has a zero fraction and the quotient is zero; registers AC and MQ 
are both reset to 0 and the division process is terminated. If neither subregister SR(FR) 
nor subregister AC(FR) is zero, the division process proceeds to dividend alignment. 


5.4.2 Dividend Alignment 


Before the division of the fractions of two floating-point numbers begins, the
dividend is properly aligned with respect to the divisor so that the quotient will be
in the normal form. The dividend alignment is shown in the sequence chart in Fig.
5.14. The proper alignment requires that the fraction of the divisor in subregister
SR(FR) be larger than that of the dividend in subregister AC(FR). The two fractions
are compared by a subtraction test. The subtraction is actually substituted by the
addition of the 1’s complement of the fraction part in subregister AC(9Q,FR). After
the addition, if terminal Z(Q) defined by (5.8) is 1, then the fraction part in subregister
SR(FR) is greater than the fraction part in subregister AC(FR), and the process
proceeds to both characteristic subtraction and fraction division. If terminal Z(Q)
is 0, then the fraction part in subregister SR(FR) is less than or equal to that in
subregister AC(FR). In this latter case, the dividend is shifted one bit position to the
right so that the fraction part in subregister SR(FR) becomes larger than the fraction
part in subregister AC(9Q,FR). During the right shift, a 1 is inserted into bit AC(9Q)
because subregister AC(9Q,FR) contains the 1’s complement of the dividend or
that of the partial remainder. At the same time, the characteristic part in subregister
AC(Q,CH) is incremented by one. Bit AC(Q) is then tested for a floating-point
overflow. If an overflow occurs, the floating-point division process is terminated;
otherwise, it proceeds to both characteristic subtraction and fraction division.


[Fig. 5.14 sequence chart, recoverable labels: Dividend alignment; AC(9Q,FR)←AC(9Q,FR)'; comparison of SR(FR) with AC(FR); AC(9Q,FR)-MQ(FR)←1-AC(9Q,9-34)-AC(35)'-MQ(9-34), AC(Q,CH)←countup AC(Q,CH); CHOV←1; Proceeds to division; Proceeds to characteristic subtraction; End.]

Fig. 5.14 Sequence chart for the dividend alignment part of
floating-point division

In Fig. 5.14, the fraction of the dividend is assumed to be less than twice the fraction of
the divisor; this is the case if the dividend and the divisor are initially in the normal
form. If it is twice as large or larger, the division overflow indicator CHOV is set to
1 and the division process is terminated. The fraction part in subregister AC(FR)
remains unchanged.
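
The effect of the dividend alignment can be sketched in Python with plain integers. The sketch compares the fractions directly instead of using the 1's complement test, and it assumes normalized operands so that one right shift is always enough; the function name `align_dividend` is hypothetical.

```python
FRACTION_BITS = 27
MASK = (1 << FRACTION_BITS) - 1


def align_dividend(ac_ch, ac_fr, sr_fr):
    """Return (ac_ch, ac_fr, mq_fr, chov) with the AC fraction below the SR fraction."""
    if sr_fr > ac_fr:
        return ac_ch, ac_fr, 0, 0            # already aligned
    cascade = (ac_fr << FRACTION_BITS) >> 1  # shift the dividend right one bit into MQ
    ac_ch += 1                               # compensate in the characteristic
    chov = 1 if ac_ch > 0o377 else 0         # floating-point overflow test
    return ac_ch & 0o377, cascade >> FRACTION_BITS, cascade & MASK, chov
```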


5.4.3 Characteristic Subtraction 


Characteristic subtraction subtracts the characteristic of the divisor in
subregister SR(CH) from the characteristic of the dividend in subregister AC(Q,CH) to
give the characteristic of the quotient in subregister MQ(CH). Characteristic
subtraction is shown in the sequence chart in Fig. 5.15. Since subtraction is to be carried
out by addition of the 1’s complement of the subtrahend, the characteristic of the
divisor in subregister SR(CH) is first 1’s complemented, and is then added to the
characteristic of the dividend in subregister AC(Q,CH). The result is in subregister
AC(Q,CH).

Bit AC(Q) is now tested. If bit AC(Q) is 1, the characteristic of the dividend is
greater than the characteristic of the divisor, and the addition of the end-around carry
follows. Bias 200₈ is next added to the result. The characteristic in subregister
AC(CH) is now transferred to subregister MQ(CH), and the characteristic of the
remainder is set at 27 less than the characteristic of the dividend; this subtraction is
done by addition of the 2’s complement of 33₈ (or 745₈).

If bit AC(Q) is 0, the characteristic of the dividend is equal to or less than the
characteristic of the divisor. In the former case, the sum (or rather the difference)
in subregister AC(Q,CH) is zero; bias 200₈ is then added to it. In the latter case, a
floating-point underflow occurs; underflow indicator CHUN is set to 1 and
subregisters AC(CH) and MQ(CH) are both reset to 0.

No matter whether bit AC(Q) is 1 or 0, the characteristic of the divisor is 1’s
complemented again so that it can be restored to its original value. This completes
the process of characteristic subtraction.
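
The three cases above can be sketched in Python. The sketch follows the description literally, including the treatment of a smaller dividend characteristic as underflow, and it treats the all-1's sum as the zero difference of the equal case; the function names are assumptions made for illustration.

```python
def quotient_characteristic(ac_ch, sr_ch):
    """Return (quotient_ch, chun) following the three cases in the text."""
    total = ac_ch + (~sr_ch & 0o377)       # add the 1's complement of SR(CH)
    q = total >> 8                         # bit AC(Q)
    if q:
        diff = (total + 1) & 0o377         # end-around carry
        return diff + 0o200, 0             # add bias 200 octal
    if (total & 0o377) == 0o377:
        return 0o200, 0                    # equal characteristics: zero difference
    return 0, 1                            # dividend smaller: CHUN is set


def remainder_characteristic(ac_ch):
    """Dividend characteristic less 27, by adding 745 octal in 9 bits."""
    return (ac_ch + 0o745) & 0o777
```

For characteristics 203 and 201 octal, the quotient characteristic is 202 octal, which corresponds to the exponent difference 3 - 1 = 2.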


5.4.4 Fraction Division 


The sequence chart for the fraction division is shown in Fig. 5.16. As shown, 
the division of the fractions of two floating-point numbers is similar to the previously- 
described fixed-point division. The division begins by setting shift counter SC to 27 
for the 27 fraction bits. Recall that the fraction part in subregister AC(9Q,FR) has 
been 1’s complemented during the dividend alignment (see Fig. 5.14). Thus, terminal 
Z(Q) indicates the result of a difference. Since the fraction part of subregister SR(SR) 


Characteristic 
subtraction 
SR(Q, CH)<SR(Q, CH)’ 
АС(О, CH)-AC(OQ, СН) add O-SR(CH) 
Pa = 
АС(О)-1 


AC(Q, СН)-777; 


# | AC(CH)<SR(H) 


CHUN+1 


SR(Q, CH)<SR(Q, CH)’ 


End 


AC(Q, CH)+AC(Q, CH) add 200, 
АС(СН)<0, 
МО(СН)<0, 


Fig. 5.15 Sequence chart for the characteristic subtraction part 
of floating-point division 


191 


Fraction 
division 
SC-338 


AC(9Q, ЕВ)-МО(ЕВ)=АС(ЕВ)-МО({9)'-МО(10-35)-0 


2(Q) =1 E 
аат 


# | ЅА(ЕВ) <АС(ЕВ) 


MQ(35)<1 


AC(9Q, ЕВ)-АС(9О, FR) add 0-SR(FR) 


SC<countdn SC 
+ 
AC(9Q, ЕВ)-АС(9О, FR)’ 


End 


Fig. 5.16 Sequence chart for the fraction division part of 
floating-point division 


192 


Sec. 5.5 CDL Description 193 


at this time is larger than that part in subregister AC(FR), casregister
AC(9Q,FR)-MQ(FR) is then shifted one bit position to the left. Bit MQ(9) is complemented during
this shift because the fraction part in subregister AC(9Q,FR) is kept during the
division in the 1's complemented form. The fraction part in subregister AC(9Q,FR)
is next compared with that in subregister SR(FR). If 0 occurs in bit AC(9Q), the
fraction in AC(FR) is larger than or equal to that in SR(FR); in this case, a 1 is
inserted into bit MQ(35) and the fraction parts in the AC and SR registers are added
to each other. Otherwise, there is no insertion of 1 and no addition of fractions. In
either case, shift counter SC is decremented by one and then tested for zero. If shift
counter SC does not contain a 0, the above steps of left-shifting, testing, inserting,
adding, and decrementing are repeated until shift counter SC contains a 0. At this
time, the fraction part in subregister AC(9Q,FR) is restored by being 1's complemented.
The division process is now completed.
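
The shift, compare, and insert loop described above can be sketched in Python. This is only an illustrative model under stated assumptions: it works on plain unsigned 27-bit integers and uses an ordinary subtraction in place of the 1's complemented accumulator, and the names divide_fractions, FRACTION_BITS, quotient, and remainder are hypothetical, not part of the CDL description.

```python
FRACTION_BITS = 27  # number of fraction bits, the initial value of counter SC


def divide_fractions(dividend, divisor):
    """Shift-and-subtract division of two unsigned fractions.

    Assumes dividend < divisor, which the dividend alignment is meant to ensure.
    Returns (quotient, remainder), where quotient plays the role of MQ(FR).
    """
    if not 0 <= dividend < divisor < (1 << FRACTION_BITS):
        raise ValueError("need 0 <= dividend < divisor < 2**27")
    remainder = dividend
    quotient = 0
    for _ in range(FRACTION_BITS):      # one pass per count of SC
        remainder <<= 1                 # shift the partial remainder left
        quotient <<= 1                  # make room for the next quotient bit
        if remainder >= divisor:        # the comparison step
            remainder -= divisor        # subtract the divisor
            quotient |= 1               # insert a 1 into the low quotient bit
    return quotient, remainder
```

For example, divide_fractions(3 << 24, 5 << 24) yields the first 27 quotient bits of 3/5.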


5.5 CDL Description 


The algorithms for floating-point addition, subtraction, multiplication, and divi- 
sion have been described by sequence charts. They are now described in the CDL 
statements. The timing and control signals are first described, followed by the descrip- 
tion of the three sequences. 


5.5.1 Timing and Control Signals 


The configuration for sequencing the floating-point arithmetic processes is almost 
identical to that for sequencing the fixed-point arithmetic process. It is described 
below: 


Comment, configuration for sequencing the floating-point arithmetic unit (5.9)

Register, D(0-3), $timing register
          AD(0-14), $address register of the memory
          OPCODE(0-3), $op-code register
          WC(0-3), $wait counter
          I, $when 1, fetch cycle; when 0, execution cycle
          READ, $memory read command
          SI, $sign indicator
          N, $control register
          T, $control register
          W, $control register
          Y, $control register
Memory, M(AD)=M(0-32767,0-35),
Decoder, K(0-11)=D,
         J(0-11)=OPCODE,
Switch, START(ON),
Clock, P(1-2),
Terminal, ADS=(J(0)+J(1))*I,
          MPY=J(2)*I,
          DIV=J(3)*I,
          FAS=(J(4)+J(5))*I,
          FMP=J(6)*I,
          FDV=J(7)*I,
          FA=J(8)*I,
          NO=J(9)*I,
          CS=J(10)*I,
/START(ON)/ D←0,
/P(2)/ D←countup D,

In the description above, register T, counter WC, and new commands are additionally 
declared. Command terminals FAS, FMP, and FDV command, respectively, the 
floating-point addition and subtraction sequence, the floating-point multiplication 
sequence, and the floating-point division sequence. Command terminals FA, NO, 
and CS are auxiliary commands for commanding auxiliary sequences and subse- 
quences. 

In the subsequent descriptions of arithmetic processes, the time for the 12 steps 
of an execution cycle is again chosen to be equal to the memory cycle time, and each 
execution cycle is chosen to coincide with each memory cycle. This choice is made 
so as to keep the memory and the arithmetic unit both in operation one cycle after 
another. It is again assumed that, when the memory is commanded to read during 
the first step, the word from the memory becomes available in register SR at the end 
of the fourth step. 
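
The sequencing mechanism above (register D counted up by clock P(2), decoder K selecting the current step, and micro-operations that load D to form loops or to end a cycle) can be illustrated with a small Python model. It is a sketch under assumptions: the class name TimingRegister and its methods are hypothetical, and only the 4-bit wrap-around behavior implied by the declaration D(0-3) is modeled.

```python
class TimingRegister:
    """Toy model of the 4-bit timing register D and decoder K(0-11)."""

    def __init__(self):
        self.d = 0                      # /START(ON)/ D <- 0

    def k(self, i):
        """Decoder output K(i): true only when D holds i."""
        return self.d == i

    def load(self, value):
        """Model a micro-operation such as D <- 8 or D <- 15."""
        self.d = value & 0xF            # D has four bits

    def clock_p2(self):
        """Model /P(2)/ D <- countup D."""
        self.d = (self.d + 1) & 0xF


timer = TimingRegister()
timer.load(15)      # end of an execution cycle
timer.clock_p2()    # D wraps to 0, so K(0) is active next
assert timer.k(0)
timer.load(8)       # loop back: the next P(2) selects step K(9)
timer.clock_p2()
assert timer.k(9)
```

Loading 15 and counting up wraps D to 0, which is how a sequence hands control to the first step of the next cycle in this model.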


5.5.2 Floating-point Addition and Subtraction 


The floating-point addition and subtraction process has been described in the
sequence charts in Figs. 5.4 and 5.7. Compared with the fixed-point process,
the floating-point addition and subtraction sequence is more complex because
more functions, such as normalization and characteristic alignment, are required.
Auxiliary command FA is employed.

The sequence chart for the initiation has been shown in Fig. 5.4. The control
signals for the 6 steps of this subsequence are FAS*K(0)*P(1), ..., and FAS*K(5)*P(1).
The CDL description of the initiation is shown below.


Comment, floating-point addition and subtraction (5.10)

Comment, initialization (5.11)
/FAS*K(0)*P(1)/ READ←1,
/FAS*K(1)*P(1)/ N←0, T←0, W←0, Y←0, WC←0,
/FAS*K(2)*P(1)/ MQ←0,
/FAS*K(3)*P(1)/ SR←M(C),
/FAS*K(4)*P(1)/ IF (OPCODE=1) THEN (SR(S)←SR(S)'),
/FAS*K(5)*P(1)/ IF (SR(S)=AC(S)) THEN (SI←0) ELSE (SI←1),
                IF (AC(FR)≠0) THEN
                   (IF (SR(FR)≠0) THEN (N←1, AC(CH)←AC(CH)',
                    SC←27) ELSE (Y←1)),
                IF (AC(FR)=0) THEN
                   (IF (SR(FR)=0) THEN (AC←0) ELSE (AC←SR), Y←1),
/FAS*K(11)*P(1)/ IF (Y=1) THEN (Y←0, I←0, D←15),


In the description above, if no zero fraction is found during the sixth step, the follow- 
ing micro-operations are activated to initiate the characteristic alignment subsequence: 


N←1, AC(CH)←AC(CH)', SC←27,


Otherwise, register Y is set to 1 and the floating-point addition and subtraction 
sequence is terminated at the last step of this execution cycle. 

The sequence chart for the characteristic alignment has been shown in Fig. 5.5.
The control signals for this subsequence are N*FAS*K(6)*P(1), ..., and
N*FAS*K(10)*P(1). To enter this part, register N is set to 1; and to leave this part, register
N is reset to 0. There are two subsequences, one controlled by register T and the
other by register W. If the two characteristics are initially aligned, these two
subsequences are bypassed; this is accomplished by leaving registers T and W both 0.
The loop for repeatedly shifting casregister AC(9Q,FR)-MQ(FR) to the right is
accomplished by setting register D to 8 so that the next clock P(2) will set register
D to 9. The CDL description of the characteristic alignment is shown below.


Comment, characteristic alignment (5.12)
/N*FAS*K(6)*P(1)/ IF (Z(1-8)=377₈) THEN (AC(CH)←AC(CH)')
                  ELSE (IF (Z(Q)≠1) THEN (AC(CH)←AC(CH)',
                  T←1) ELSE (W←1)),
/T*N*FAS*K(7)*P(1)/ SR←AC, AC←SR,
/T*N*FAS*K(8)*P(1)/ AC(CH)←AC(CH)', T←0, W←1,
/W*N*FAS*K(9)*P(1)/ AC(9Q,FR)-MQ(FR)←shr AC(9Q,FR)-MQ(FR),
                  AC(Q,CH)←countdn AC(Q,CH),
                  SC←countdn SC,
/W*N*FAS*K(10)*P(1)/ IF ((SC=0)+(Z(1-8)=377₈))
                  THEN (AC(CH)←AC(CH)', W←0, DO WAIT)
                  ELSE (D←8),
/N*FAS*K(11)*P(1)/ IF (WC=0) THEN (N←0, D←15, OPCODE←8)
                  ELSE (D←10, WC←countdn WC)


In the description above, if the characteristics are initially aligned, counter WC
is reset to 0 during the first step, and registers W and T remain 0. Thus, the
micro-operations during the eighth to eleventh steps are all bypassed, and the characteristic
alignment subsequence is terminated during the last step.

The characteristic alignment above can be completed at the first, third, fifth, 
seventh, ninth, or eleventh step of the execution cycle. It is desirable and, in most 
cases, necessary to have the alignment terminated during the last step of the execution 
cycle. Counter WC and the block statement shown below are provided for this
purpose.


Block, WAIT (IF ((SC=26)+(SC=20)+(SC=14)+(SC=8)+(SC=2)) (5.13)
             THEN (WC←0),
             IF ((SC=25)+(SC=19)+(SC=13)+(SC=7)+(SC=1))
             THEN (WC←10),
             IF ((SC=24)+(SC=18)+(SC=12)+(SC=6)+(SC=0))
             THEN (WC←8),
             IF ((SC=23)+(SC=17)+(SC=11)+(SC=5))
             THEN (WC←6),
             IF ((SC=22)+(SC=16)+(SC=10)+(SC=4))
             THEN (WC←4),
             IF ((SC=21)+(SC=15)+(SC=9)+(SC=3))
             THEN (WC←2)),


The block statement above defines a block of micro-operations called WAIT 
which sets counter WC to an appropriate value (0, 2, 4, 6, 8, or 10). When the sub- 
sequence reaches the last step, counter WC is decremented as described in the last 
statement in (5.12). This statement forms a one-statement loop. A precise number 
of steps will have elapsed so that when counter WC becomes 0, it is at the end of
that execution cycle. At this time, register N is reset to 0 and register OPCODE is
set to give the auxiliary command FA. 
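
The values assigned by block WAIT follow a regular pattern in SC. The Python sketch below tabulates the block as listed in (5.13) and checks it against a closed form; the closed form 2*((SC-2) mod 6) is an observation made here, not a statement from the CDL description, and the names WAIT_GROUPS, wait_table, and wait_formula are hypothetical.

```python
# Each entry pairs a WC value with the SC values that select it in block WAIT.
WAIT_GROUPS = {
    0: (26, 20, 14, 8, 2),
    10: (25, 19, 13, 7, 1),
    8: (24, 18, 12, 6, 0),
    6: (23, 17, 11, 5),
    4: (22, 16, 10, 4),
    2: (21, 15, 9, 3),
}


def wait_table(sc):
    """Look up the WC value that block WAIT assigns for a given SC."""
    for wc, sc_values in WAIT_GROUPS.items():
        if sc in sc_values:
            return wc
    raise ValueError("SC value not covered by block WAIT")


def wait_formula(sc):
    """Closed form reproducing the table for SC in 0..26 (an observation)."""
    return 2 * ((sc - 2) % 6)


# The table and the closed form agree for every SC value the block covers.
assert all(wait_table(sc) == wait_formula(sc) for sc in range(27))
```

The period of 6 in SC reflects that each pass through the shifting loop takes two steps of a 12-step execution cycle.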

The sequence chart for the fraction addition and subtraction as well as the
normalization has been shown in Figs. 5.6 and 5.7, respectively. The control signals
are FA*K(0)*P(1), ..., FA*K(5)*P(1), and FA*K(9)*P(1), ..., FA*K(11)*P(1),
where the command is the auxiliary command FA. The first three steps form the
fraction addition subsequence; the second three steps, the fraction subtraction
subsequence; and the last three steps, the normalization subsequence. The CDL
description of the fraction addition subsequence and the fraction subtraction subsequence
is shown below.


Comment, fraction addition (5.14)
/SI'*FA*K(0)*P(1)/ AC(9Q,FR)←AC(9Q,FR) add 0-SR(FR),
/SI'*FA*K(1)*P(1)/ IF (AC(9Q)≠1) THEN (Y←1)
                   ELSE (AC(9Q,FR)-MQ(FR)←shr AC(9Q,FR)-MQ(FR),
                   AC(Q,CH)←countup AC(Q,CH), T←1),
/SI'*FA*K(2)*P(1)/ IF ((AC(Q)=1)*(T=1)) THEN (CHOV←1), T←0, Y←1,
Comment, fraction subtraction
/SI*FA*K(3)*P(1)/ AC(9Q,FR)←AC(9Q,FR)',
/SI*FA*K(4)*P(1)/ AC(9Q,FR)←AC(9Q,FR) add 0-SR(FR),
/SI*FA*K(5)*P(1)/ IF (AC(9Q)=1) THEN
                  (AC(9Q,FR)←AC(9Q,FR) add 1, AC(S)←SR(S), N←1),
                  IF (AC(9Q)≠1) THEN
                  (IF (AC(FR)=7...7) THEN (AC←0, Y←1)
                  ELSE (AC(9Q,FR)←AC(9Q,FR)', N←1)),
/FA*K(11)*P(1)/ IF (Y=1) THEN (Y←0, I←0, D←15),
In the description above, the fraction addition subsequence is entered if register
SI has been reset to 0; otherwise, the fraction subtraction subsequence is entered.
During either of these two subsequences, register Y is set to 1 if the floating-point
addition and subtraction process is terminated, or register N is set to 1 if the
normalization subsequence is initiated.

The control signals for the normalization subsequence are N*FA*K(9)*P(1),
..., and N*FA*K(11)*P(1). The CDL description is shown below.


Comment, normalization (5.15)
/N*FA*K(8)*P(1)/ SC←27,
/N*FA*K(9)*P(1)/ IF (AC(9)=1) THEN (MQ←0, WC←0,
                 IF (MQ(9)=1) THEN
                 (AC(9Q,FR)←AC(9Q,FR) add 1)),
                 IF (AC(9)≠1) THEN (AC(9Q,FR)-MQ(FR)←shl
                 AC(9Q,FR)-MQ(FR),
                 AC(Q,CH)←countdn AC(Q,CH),
                 SC←countdn SC, W←1),
/W*N*FA*K(10)*P(1)/ IF ((SC≠0)*(AC(Q)≠1)) THEN (D←8),
                 IF ((SC≠0)*(AC(Q)=1)) THEN (CHUN←1, AC←0,
                 MQ←0, W←0, DO WAIT),
                 IF (SC=0) THEN (AC←0, MQ←0, WC←8),
/N*FA*K(11)*P(1)/ IF (WC=0) THEN (I←0, D←15)
                 ELSE (D←10, WC←countdn WC),
END


In the description above, if the sum is normalized after the addition, no normal- 
ization is required. In this case, counter WC is reset to 0 during the first step of the 
normalization subsequence, and the floating-point addition and subtraction process 
is terminated at the last step. However, the normalization can be completed at the 
first, third, fifth, seventh, ninth, or eleventh step of an execution cycle. In order to 
have normalization terminated exactly at the last step of an execution cycle, counter 
WC and block WAIT are again used in the manner shown in description (5.12) for 
the characteristic alignment. 
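
The normalization loop (shift the fraction left and count the characteristic down until the leading fraction bit is 1) can be sketched as follows. This is a simplified model under assumptions: it ignores the MQ extension and the rounding step, treats a characteristic that would drop below zero as underflow, and the names normalize, characteristic, fraction, and underflow are hypothetical.

```python
FRACTION_BITS = 27
MSB = 1 << (FRACTION_BITS - 1)          # leading fraction bit
MASK = (1 << FRACTION_BITS) - 1


def normalize(characteristic, fraction):
    """Left-shift a nonzero fraction until its leading bit is 1.

    Returns (characteristic, fraction, underflow). On underflow the result is
    cleared to zero, mirroring the CHUN branch that clears AC and MQ.
    """
    if fraction == 0:
        return 0, 0, False              # a zero fraction cannot be normalized
    while not fraction & MSB:
        fraction = (fraction << 1) & MASK
        characteristic -= 1             # one count down per shift
        if characteristic < 0:
            return 0, 0, True           # characteristic underflow
    return characteristic, fraction, False
```

For instance, normalize(130, 1 << 24) performs two shifts and returns (128, 1 << 26, False).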


5.5.3 Floating-point Multiplication 


The floating-point multiplication process has been described in the sequence
charts in Figs. 5.9-5.11. The control signals for the floating-point multiplication
sequence are FMP*K(0)*P(1), ..., and FMP*K(8)*P(1), except for its normalization
part in which the auxiliary command NO is employed.

The sequence chart for the initiation has been shown in Fig. 5.9. The control
signals for the first six steps of this subsequence are FMP*K(0)*P(1), ..., and
FMP*K(5)*P(1). The CDL description of the initiation is shown in the following:


Comment, Floating-point multiplication (5.16)

Comment, Initialization (5.17)
/FMP*K(0)*P(1)/ READ←1,
/FMP*K(1)*P(1)/ N←0, T←0, W←0, Y←0, WC←0,
/FMP*K(2)*P(1)/ SC←27, AC(9Q,FR)←0, AC(Q,CH)←0-MQ(CH),
/FMP*K(3)*P(1)/ SR←M(C),
/FMP*K(4)*P(1)/ IF (MQ(S)=SR(S)) THEN (AC(S)←0, MQ(S)←0)
                ELSE (AC(S)←1, MQ(S)←1),
/FMP*K(5)*P(1)/ IF ((SR(FR)=0)+(MQ(FR)=0)) THEN (AC←0,
                MQ←0, Y←1) ELSE (T←1, N←1),
/FMP*K(11)*P(1)/ IF (Y=1) THEN (Y←0, I←0, D←15),


In the description above, if no zero fraction is found in the multiplicand or in the
multiplier during the sixth step, register T is set to 1 to initiate the characteristic
addition subsequence and subsequently register N is set to 1 to initiate the fraction
multiplication subsequence. If a zero fraction exists in the multiplicand or the
multiplier, register Y is set to 1 and the floating-point multiplication sequence is terminated
at the last step of the current execution cycle.

The sequence chart for the characteristic addition has been shown in Fig. 5.10.
The control signals for this subsequence are T*FMP*K(6)*P(1), ..., and
T*FMP*K(8)*P(1). To enter this subsequence, register T is set to 1; to leave this subsequence,
register T is reset to 0. The CDL description of the characteristic addition is shown
below.

Comment, characteristic addition (5.18)
/T*FMP*K(6)*P(1)/ AC(Q,CH)←AC(Q,CH) add 0-SR(CH),
/T*FMP*K(7)*P(1)/ AC(Q,CH)←AC(Q,CH) add 200₈,
/T*FMP*K(8)*P(1)/ T←0, IF (AC(Q)=1) THEN (CHOV←1, N←0)
                  ELSE (N←1),
/FMP*K(11)*P(1)/ IF ((N=0)*(T=0)) THEN (I←0, D←15),


If a floating-point overflow occurs, the floating-point multiplication sequence is 
terminated at the last step of the current execution cycle. 

The sequence chart for the fraction multiplication part has been shown in Fig.
5.11. The control signals for this subsequence are N*FMP*K(9)*P(1), ..., and
N*FMP*K(11)*P(1). Register N is set to 1 to start this subsequence and reset to 0
to terminate this subsequence. The CDL description of the fraction multiplication
is shown below.

Comment, fraction multiplication (5.19)
/N*FMP*K(9)*P(1)/ IF (MQ(35)=1) THEN (AC(9Q,FR)←AC(9Q,FR)
                  add 0-SR(FR)), SC←countdn SC,
/N*FMP*K(10)*P(1)/ AC(9Q,FR)-MQ(FR)←shr AC(9Q,FR)-MQ(FR),
                  IF (SC≠0) THEN (D←8) ELSE (WC←5),
/N*FMP*K(11)*P(1)/ IF (WC=0) THEN (N←0, OPCODE←9, D←15)
                  ELSE (D←10, WC←countdn WC),




The above loop for addition and right shifting is controlled by setting register 
D to 8. There are 27 iterations of the loop for the 27 fraction bits. Thus, the particular 
step of the execution cycle at which the fraction multiplication is completed is found 
to occur at the second step. Therefore, during that step, counter WC is set to 5 so 
that, when counter WC reaches 0, it is at the eighth step of execution cycle and is 
ready for the normalization subsequence at the next step. 
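
The add-and-shift-right loop for the 27 fraction bits can be sketched in Python. The sketch assumes unsigned 27-bit fractions and models only the arithmetic of the loop, not its timing; the names multiply_fractions, ac, mq, and sr are chosen to echo the registers but the function itself is hypothetical.

```python
FRACTION_BITS = 27


def multiply_fractions(multiplicand, multiplier):
    """Add-and-shift multiplication of two unsigned 27-bit fractions.

    Returns (ac, mq): the more and less significant halves of the product.
    """
    limit = 1 << FRACTION_BITS
    if not (0 <= multiplicand < limit and 0 <= multiplier < limit):
        raise ValueError("operands must fit in 27 bits")
    sr = multiplicand                   # multiplicand fraction, as in SR(FR)
    mq = multiplier                     # multiplier fraction, as in MQ(FR)
    ac = 0                              # partial product, as in AC(9Q,FR)
    for _ in range(FRACTION_BITS):      # one pass per count of SC
        if mq & 1:                      # test the lowest multiplier bit
            ac += sr                    # may carry into the extra bit 9Q
        # shift the cascaded register AC-MQ one position to the right
        mq = ((ac & 1) << (FRACTION_BITS - 1)) | (mq >> 1)
        ac >>= 1
    return ac, mq
```

After the loop, (ac << 27) | mq equals the full 54-bit product of the two fractions.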

The sequence chart for the normalization part has been shown in Fig. 5.7. The
control signals for the normalization subsequence here are NO*K(9)*P(1), ..., and
NO*K(11)*P(1). The CDL description of the normalization for the floating-point
multiplication process is shown below.


Comment, normalization (5.20)
/NO*K(9)*P(1)/ SC←27,
               IF (AC(9)=1) THEN (MQ←0, WC←0,
               IF (MQ(9)=1) THEN
               (AC(9Q,FR)←AC(9Q,FR) add 1)),
               IF (AC(9)≠1) THEN (AC(9Q,FR)-MQ(FR)←shl
               AC(9Q,FR)-MQ(FR),
               AC(Q,CH)←countdn AC(Q,CH),
               SC←countdn SC, W←1),
/W*NO*K(10)*P(1)/ IF ((SC≠0)*(AC(Q)≠1)) THEN (D←8),
               IF ((SC≠0)*(AC(Q)=1)) THEN (CHUN←1, AC←0,
               MQ←0, W←0, DO WAIT),
               IF (SC=0) THEN (AC←0, MQ←0, W←0, WC←8),
/NO*K(11)*P(1)/ IF (WC=0) THEN (I←0, D←15)
               ELSE (D←10, WC←countdn WC),
END


The description above is identical to that of (5.15) except that the command here 
is NO and register N is not used. 


5.5.4 Floating-point Division 


The floating-point division process has been described in the sequence charts 
in Figs. 5.13-5.16. The control signals for the floating-point division sequence employ 
two auxiliary commands, CS and FD. 

The sequence chart for the initiation has been shown in Fig. 5.13. The control
signals for the six steps of this subsequence are FDV*K(0)*P(1), ..., and
FDV*K(5)*P(1). The CDL description is shown below.


Comment, floating-point division (5.21)

Comment, initialization (5.22)
/FDV*K(0)*P(1)/ READ←1,
/FDV*K(1)*P(1)/ N←0, T←0, W←0, Y←0, WC←0,
/FDV*K(2)*P(1)/ MQ←0,
/FDV*K(3)*P(1)/ SR←M(C),
/FDV*K(4)*P(1)/ IF (SR(S)=AC(S)) THEN (MQ(S)←0) ELSE (MQ(S)←1),
/FDV*K(5)*P(1)/ IF (SR(FR)=0) THEN (DVOV←1, Y←1) ELSE
                (IF (AC(FR)=0) THEN (MQ←0, AC←0, Y←1)
                ELSE (N←1)),
/FDV*K(11)*P(1)/ IF (Y=1) THEN (Y←0, I←0, D←15),


In the description above, if no zero fraction is found in the divisor or in the dividend
during the sixth step, register N is set to 1 to initiate the dividend alignment
subsequence. If a zero fraction exists in the dividend or in the divisor, register Y is set to
1 and the floating-point division sequence is terminated at the last step of the current
execution cycle.

The dividend alignment has been shown in the sequence chart in Fig. 5.14. The
control signals for this subsequence are N*FDV*K(6)*P(1), ..., N*FDV*K(8)*P(1),
and N*FDV*K(11)*P(1). Register N is set to 1 to initiate this subsequence and is
reset to 0 to terminate this subsequence. The CDL description of this subsequence is
shown below.


Comment, dividend alignment (5.23)
/N*FDV*K(6)*P(1)/ AC(9Q,FR)←AC(9Q,FR)',
/N*FDV*K(7)*P(1)/ IF (Z(Q)≠1) THEN (AC(9Q,FR)-MQ(FR)←
                  1-AC(9Q,9-34)-AC(35)'-MQ(9-34),
                  AC(Q,CH)←countup AC(Q,CH), T←1)
                  ELSE (W←1),
/T*N*FDV*K(8)*P(1)/ T←0, IF (AC(Q)=1) THEN (CHOV←1, N←0, Y←1)
                  ELSE (W←1),
/N*FDV*K(11)*P(1)/ IF (W=1) THEN (N←0, W←0, OPCODE←10, D←15),


In the description above, if a floating-point overflow occurs, register N is reset to 
0 and register Y is set to 1 so that the floating-point division sequence will be termi- 
nated during the last step of the current execution cycle. Otherwise, the auxiliary 
command CS is generated to activate the characteristic subtraction subsequence. 

The characteristic subtraction has been shown in the sequence chart in Fig. 5.15.
The control signals for this subsequence are CS*K(0)*P(1), ..., and CS*K(7)*P(1).
The CDL description is shown below.


Comment, characteristic subtraction (5.24)
/CS*K(0)*P(1)/ SR(Q,CH)←SR(Q,CH)',
/CS*K(1)*P(1)/ AC(Q,CH)←AC(Q,CH) add 0-SR(CH),
/CS*K(2)*P(1)/ IF (AC(Q)=1) THEN (AC(Q)←0, N←1) ELSE (W←1),
               IF (AC(Q,CH)=777₈)
               THEN (AC(Q,CH)←AC(Q,CH) add 200₈)
               ELSE (CHUN←1, AC(CH)←0, MQ(CH)←0),
/N*CS*K(3)*P(1)/ AC(Q,CH)←AC(Q,CH) add 1,
/N*CS*K(4)*P(1)/ AC(Q,CH)←AC(Q,CH) add 200₈,
/N*CS*K(5)*P(1)/ MQ(CH)←AC(CH),
/N*CS*K(6)*P(1)/ AC(Q,CH)←AC(Q,CH) add 745₈, W←1, N←0,
/W*CS*K(7)*P(1)/ SR(Q,CH)←SR(Q,CH)', SC←27, W←0,


In the description above, there are two subsequences. One of them, which is controlled
by register N, is for the case when the difference is positive; the other, which is
described by the third execution statement in (5.24), is for the case when the difference
is either zero or negative. This subsequence is terminated at the eighth step and is
followed immediately by the fraction division, instead of being executed in parallel
with it as described in Figs. 5.12, 5.15, and 5.16.

The fraction division has been shown in the sequence chart in Fig. 5.16. The
control signals for this subsequence are CS*K(8)*P(1), ..., and CS*K(11)*P(1). The
CDL description is shown in the following:


Comment, fraction division (5.25)
/CS*K(8)*P(1)/ AC(9Q,FR)-MQ(FR)←AC(FR)-MQ(9)'-MQ(10-35)-0,
               SC←countdn SC,
/CS*K(9)*P(1)/ IF (Z(Q)=1) THEN (MQ(35)←1,
               AC(9Q,FR)←AC(9Q,FR) add 0-SR(FR)),
/CS*K(10)*P(1)/ IF (SC≠0) THEN (D←7)
               ELSE (AC(9Q,FR)←AC(9Q,FR)', WC←7),
/CS*K(11)*P(1)/ IF (WC=0) THEN (I←0, D←15)
               ELSE (D←10, WC←countdn WC),
END


In the description above, the loop for repeatedly shifting casregister
AC(9Q,FR)-MQ(FR) to the left, adding the divisor, and decrementing counter SC is achieved by
setting register D to 7 so that the next clock P(2) sets register D to 8. After going
through the loop 27 times, the fraction division is completed during the fourth step
of an execution cycle. In order to have this subsequence terminated exactly at the
last step of that execution cycle, counter WC is set to 7. Counter WC is then
decremented in the one-statement loop until it reaches 0. At that time, the floating-point
division process is completed.


References 


1. BURKS, A. W., GOLDSTINE, H. H., and VON NEUMANN, J., "Preliminary Discussion of
   the Logical Design of an Electronic Computing Instrument," 1946. Reprinted in
   Datamation 8, No. 9, pp. 24-31, September, 1962, and 8, No. 10, pp. 36-41, October,
   1962.

2. PHISTER, M., JR., Logical Design of Digital Computers. New York: John Wiley & Sons,
   Inc., 1958.

3. METROPOLIS, N., and ASHENHURST, R. L., "Significant Digit Computer Arithmetic,"
   IRE Transactions on Electronic Computers, December, 1958, pp. 265-267.

4. BLAAUW, G. A., "Indexing and Control-Word Technique," IBM Journal of Research
   and Development 3, July, 1959, pp. 288-301.

5. GRAY, H. L., and HARRISON, C., JR., "Normalized Floating-point Arithmetic with an
   Index of Significance," Proc. of the Eastern Joint Computer Conference, 1959, pp. 244-248.

6. ASHENHURST, R. L., and METROPOLIS, N., "Unnormalized Floating-point Arithmetic,"
   J. of the ACM, July, 1959, pp. 415-428.

7. CARR, J. W., III, "Error Analysis in Floating-point Arithmetic," Comm. of the ACM,
   May, 1959, pp. 10-15.

8. WADEY, W. G., "Floating-point Arithmetics," J. of the ACM, April, 1960, pp. 129-139.

9. BECKMAN, F. S., BROOKS, F. P., JR., and LAWLESS, W. J., JR., "Developments in the
   Logical Organization of Computer Arithmetic and Control Units," Proc. of the IRE
   49, No. 1, pp. 53-66, January, 1961.

10. MACSORLEY, O. L., "High-Speed Arithmetic in Binary Computers," Proc. of the IRE
    49, No. 1, pp. 67-91, January, 1961.

11. BUCHHOLZ, W., Planning a Computer System. New York: McGraw-Hill Book Company,
    1962.

12. CHU, Y., Digital Computer Design Fundamentals. New York: McGraw-Hill Book
    Company, 1962.

13. FLORES, I., The Logic of Computer Arithmetic. Englewood Cliffs, N.J.: Prentice-Hall, Inc.,
    1963.

14. STERBENZ, P. H., "Floating-point Number Systems," Notes, June, 1966.

15. "IBM 7094 Principles of Operation," Form A22-6703-4, 5th Edition, October
    21, 1966.

16. FLORES, I., Computer Design. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1967.

17. GSCHWIND, H. W., Design of Digital Computers. Springer-Verlag New York, Inc., 1967.

18. KNUTH, D. E., The Art of Computer Programming, vol. 2, Seminumerical Algorithms.
    Reading, Massachusetts: Addison-Wesley Publishing Co., Inc., 1969.

19. CHU, Y., Introduction to Computer Organization. Englewood Cliffs, N.J.: Prentice-Hall,
    Inc., 1970.


Problems 


5.1. Write a procedural CDL description to describe the floating-point addition and
     subtraction shown by the sequence charts in Figs. 5.4-5.7.

5.2. Repeat Problem 5.1 to describe the floating-point multiplication shown by the
     sequence charts in Figs. 5.9-5.11.

5.3. Repeat Problem 5.1 to describe the floating-point division shown by the sequence charts
     in Figs. 5.13-5.16.

5.4. By using an algorithmic language such as Fortran, write a program to simulate and test
     the floating-point addition and subtraction described by the sequence charts in Figs.
     5.4-5.7.

5.5. Repeat Problem 5.4 to test the floating-point multiplication described by the sequence
     charts in Figs. 5.9-5.11.

5.6. Repeat Problem 5.4 to test the floating-point division described by the sequence charts in
     Figs. 5.13-5.16.

5.7. Modify the parallel adder with group and section carries as described by terminal
     statement (4.37) so that it can also be used for the floating-point arithmetic unit.

5.8. By using an algorithmic language such as Fortran, simulate and test the parallel adder
     described in Problem 5.7.

5.9. What are the maximum number and the minimum number of execution cycles that are
     required for the floating-point addition and subtraction described in statements
     (5.10)-(5.15)?

5.10. What are the maximum number and the minimum number of execution cycles that are
      required for the floating-point multiplication described in statements (5.16)-(5.20)?

5.11. What are the maximum number and the minimum number of execution cycles that are
      required for the floating-point division described in statements (5.21)-(5.25)?

5.12. Select a control memory, its control word format, and other computer elements.
      Prepare a microprogram for
      (a) the floating-point addition and subtraction sequence described by statements
          (5.10)-(5.15),
      (b) the floating-point multiplication sequence described by statements (5.16)-(5.20),
          and
      (c) the floating-point division sequence described by statements (5.21)-(5.25) so that
          these sequences are sequenced by the microprogram in the control memory.

5.13. Block WAIT is defined in statement (5.13). Use the following micro-operation,

          WC←count12 WC,

      to replace block WAIT in statements (5.12) and (5.15), where count12 is a special
      operator which counts modulo 12.


6 Serial Arithmetic Units


An arithmetic unit is capable of performing addition, subtraction, multiplication,
and division. These arithmetic operations can be performed in parallel, in series, or
in their combination. A parallel arithmetic unit adds the digits of two numbers at
the same time, but a serial arithmetic unit may perform addition of two numbers
digit-by-digit with a simple adder. Thus, a parallel arithmetic unit takes a shorter
time to perform addition operations, but a serial arithmetic unit costs less to build.
This chapter deals with serial arithmetic units, while Chapters 4 and 5 describe
parallel arithmetic units.

In this chapter, a binary, serial arithmetic unit and a decimal, serial-digit arithmetic
unit are described. A binary, serial arithmetic unit may perform addition on a
single bit at a time, or on multiple bits (such as two bits) at a time. Thus, a single-bit
adder and/or subtracter, or a multiple-bit adder and/or subtracter is needed.
Similarly, the decimal arithmetic unit may perform addition on a single decimal
digit at a time or on multiple decimal digits at a time; therefore, a single decimal-digit
adder or a multiple decimal-digit adder is required. The binary arithmetic
unit described in this chapter makes use of a single-bit adder-subtracter and the
decimal arithmetic unit makes use of a single decimal-digit adder-subtracter.


6.1 Configuration of a Binary, Serial Arithmetic Unit 


The configuration of the binary, serial arithmetic unit begins with the choice 
of the binary number. The adder-subtracter is then specified, followed by the selection 
of computer elements. Selection of the elements for the control part, however, will 
be deferred until the timing and control signals of the unit are described. 


6.1.1 Number Representation 


The binary number chosen for the arithmetic unit has 24 bits. It is in the signed 
2’s complement representation (4). The format is shown in Fig. 6.1 where the binary 


Fig. 6.1 Format of the binary number


point is located between the sign bit and the most significant number bit; thus, the 
number is fractional and has a fixed-point. 
When the number is positive, it is represented by the sign and the magnitude or,

    X = 0 \times 2^{0} + \sum_{i=1}^{23} x_i 2^{-i}                          (6.1)

where X is the number, 0 represents the positive sign, and the x_i's are the number bits.
The largest number is 0.11...1 or (2^{0} - 2^{-23}). When the number is negative, it is
represented by the 2's complement of the negative number or,

    X = 1 \times 2^{0} + \left( 1 - \sum_{i=1}^{23} x_i 2^{-i} \right)       (6.2)

where 1 represents the negative sign. Since the 2's complement of a number is equal
to the sum of the 1's complement of the number and a least significant number bit,
the above representation may also be written as,

    X = 1 \times 2^{0} + \sum_{i=1}^{23} \bar{x}_i 2^{-i} + 2^{-23}          (6.3)

and

    \bar{x}_i = 1 - x_i

where \bar{x}_i is the 1's complement of x_i. The smallest negative number is 1.00...0
or -1.
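
The 24-bit signed 2's complement fraction format can be illustrated with a short Python sketch. It is a minimal model under assumptions: values are exact rationals, and the helper names encode, decode, BITS, and MASK are hypothetical. The final assertion checks the stated identity that the 2's complement equals the 1's complement plus one least significant bit.

```python
from fractions import Fraction

BITS = 24                               # one sign bit and 23 number bits
MASK = (1 << BITS) - 1
SCALE = 1 << (BITS - 1)                 # 2**23, the weight of the sign position


def encode(value):
    """Return the 24-bit 2's complement word for a fraction in [-1, 1)."""
    scaled = Fraction(value) * SCALE
    if scaled.denominator != 1 or not -SCALE <= scaled < SCALE:
        raise ValueError("value is not representable in 24 bits")
    return int(scaled) & MASK


def decode(word):
    """Return the fraction represented by a 24-bit 2's complement word."""
    if word & SCALE:                    # sign bit set: the number is negative
        word -= 1 << BITS
    return Fraction(word, SCALE)


assert decode(0x7FFFFF) == 1 - Fraction(1, SCALE)   # largest number
assert encode(-1) == 0x800000                        # smallest number, 1.00...0
x = encode(Fraction(3, 8))
assert encode(Fraction(-3, 8)) == ((~x & MASK) + 1) & MASK
```

Here the word is held as an unsigned integer whose most significant bit is the sign bit.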


6.1.2 Full Adder-subtracter 


A single-bit full adder is a logic network which has three inputs and two outputs.
Let the inputs be X, Y, and Wᵢ, which represent the augend, the addend, and the
input carry, respectively. Let the outputs be Z and Wₒ, which represent the sum and
the output carry, respectively. The single-bit full adder can be described by the
following terminal statement:

Terminal, Z=X⊕Y⊕Wᵢ,
          Wₒ=X*Y+Y*Wᵢ+Wᵢ*X,                                   (6.4)

A single-bit full subtracter is a logic network which also has three inputs and
two outputs. Let the inputs and the outputs be represented by X, Y, Wᵢ, Z, and Wₒ.
The single-bit full subtracter can be described by the following terminal statement:

Terminal, Z=X⊕Y⊕Wᵢ,
          Wₒ=X'*Y+Y*Wᵢ+Wᵢ*X',                                 (6.5)

As shown above, output Z is the same in both terminal statements, though Wᵢ denotes
a carry in statement (6.4), but a borrow in statement (6.5). Output Wₒ is also identical
except that input X, which is the minuend bit, is complemented in statement (6.5).

The above adder and subtracter can be combined into one logic network. Let
the single-bit register N indicate an addition when it contains a 1 and a subtraction
when it contains a 0. The single-bit full adder-subtracter can be described by the
following terminal statement:

Terminal, Z=X⊕Y⊕Wᵢ,
          Wₒ=(N⊙X)*Y+(N⊙X)*Wᵢ+Y*Wᵢ,                           (6.6)

When register N contains a 1, terminal statement (6.6) becomes statement (6.4);
when register N contains a 0, it becomes statement (6.5). The adder-subtracter
described by statement (6.6) is the one to be used in the binary, serial arithmetic
unit.
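
The combined adder-subtracter can be checked exhaustively with a few lines of Python. The sketch assumes that the operator applied to N and X is the coincidence (exclusive-NOR) function, which is what makes the network reduce to an adder for N = 1 and to a subtracter for N = 0; the function name full_adder_subtracter and its parameter names are hypothetical.

```python
def full_adder_subtracter(n, x, y, w_in):
    """Single-bit adder-subtracter: adds when n is 1, subtracts when n is 0.

    Returns (z, w_out): the sum or difference bit and the carry or borrow.
    """
    z = x ^ y ^ w_in                    # same expression for sum and difference
    nx = 1 - (n ^ x)                    # coincidence: x when n=1, x' when n=0
    w_out = (nx & y) | (nx & w_in) | (y & w_in)
    return z, w_out


bits = (0, 1)
for x in bits:
    for y in bits:
        for w in bits:
            z, c = full_adder_subtracter(1, x, y, w)
            assert 2 * c + z == x + y + w          # addition with carry out
            z, b = full_adder_subtracter(0, x, y, w)
            assert z - 2 * b == x - y - w          # subtraction with borrow out
```

The two assertions restate the arithmetic meaning of a carry and of a borrow for all eight input combinations.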




6.1.3 Configuration 


The computer elements selected for the binary, serial arithmetic unit excluding
the control part are shown in the block diagram in Fig. 6.2, where FAS denotes the
single-bit full adder-subtracter. As shown, register A is the accumulator, register Q
is the multiplier-quotient register, and register R is another operand register which
also serves as the buffer register of the memory. In conjunction with the previously
described adder-subtracter, arithmetic operations are carried out in these three
registers. Table 6.1 shows the functions of these three registers during addition,
subtraction, multiplication, and division; these functions will be further explained.


Fig. 6.2 Configuration of the binary, serial arithmetic unit
(excluding the control part)


TABLE 6.1 Functions of the Registers During Arithmetic Operations


                                  REGISTER A                  REGISTER Q
OPERATION       REGISTER R        AT THE START  AT THE END    AT THE START  AT THE END
Addition        Addend            Augend        Sum           Unused        Unused
Subtraction     Subtrahend        Minuend       Difference    Unused        Unused
Multiplication  Multiplicand      Zeros         Product†      Multiplier    Product‡
Division        Divisor           Dividend      Remainder     Zeros         Quotient


†The more significant half of the product
‡The less significant half of the product


There are two 5-bit counters, BC and WC. There are eleven single-bit registers,
SR, SA, OV, AV, DV, SUM, DIF, N, C, E, and DSTEST. The sign bits A(0) of
register A and R(0) of register R are stored in registers SA and SR, respectively.
Register OV indicates a specific carry or borrow during addition or subtraction.
Register AV indicates overflow during addition while register DV indicates overflow
during division. Registers SUM, DIF, and DSTEST are employed to initiate the
add subsequence, the subtract subsequence, and the divide-stop test subsequence,
respectively. Register N performs add-subtract control. Register C stores a carry or
a borrow. Register E is a temporary register used in multiplication and division.

The above configuration is now described by the following declaration state- 
ments: 


Comment, configuration of a binary, serial arithmetic unit                    (6.7)
Register,    R(0-23),     $buffer register
             A(0-23),     $accumulator
             Q(0-23),     $multiplier-quotient register
             BC(4-0),     $bit counter
             WC(4-0),     $word counter
             E,           $temporary register
             C,           $carry register
             AV,          $add overflow indicator
             DV,          $divide overflow indicator
             N,           $add-sub control register (add when 1)
             SUM,         $control register for add subsequence
             DIF,         $control register for subtract subsequence
             DSTEST,      $control register for divide-stop test subseq.
             OV,          $temporary register for carry or borrow
             SA,          $store sign A(0)
             SR,          $store sign R(0)
Casregister, AQE(0-48)=A-Q-E,
             AQ(0-47)=A-Q,
Comment, description of the single-bit adder-subtracter
Terminal,    Z=R(23)⊕A(23)⊕C,                          $sum terminal
             W=(N⊙A(23))*R(23)+(N⊙A(23))*C+R(23)*C,    $carry or borrow terminal
Comment, terminal for overflow test during addition or subtraction
Terminal,    AVTEST=N*SA'*SR'*C+N*SA*SR*C'+N'*SA'*SR*C'
                    +N'*SA*SR'*C,
Comment, terminal for divide-stop test during division
Terminal,    DVSTOP=N'*A(0)'*R(0)'*SA'+N'*A(0)*R(0)*(SA+E')
                    +N*A(0)'*R(0)*SA'+N*A(0)*R(0)'*(SA+E'),
In the description above, casregister AQE is declared for use in multiplication, and 
casregister AQ for use in division. The first terminal statement defines the single-bit 
full adder-subtracter where the inputs are A(23), R(23), and C, and the outputs are 
Z and W. The second terminal statement describes the terminal for testing overflow 
during addition or subtraction. The third terminal statement describes the terminal
for testing the divide-stop condition during division. The determination of these
conditions will be explained later.


6.2 Binary Addition and Subtraction 


Binary addition and subtraction in the serial arithmetic unit employ the direct 
addition and subtraction algorithm. The algorithm, the overflow conditions, the 
configuration, and the sequence chart are now presented. 


6.2.1 Algorithm 
The algorithm for addition and subtraction of binary numbers in the signed 2's
complement representation (4) is shown in the flowchart in Fig. 6.3. As shown, it
adds the addend to the augend in the case of addition, and subtracts the subtrahend
from the minuend in the case of subtraction. In either addition or subtraction, the
sign bit is treated as a number bit. An overflow may occur when the signs of the two
numbers are the same in the case of addition, or are different in the case of
subtraction; the result is then tested for overflow. The addition or subtraction is
now completed.


Fig. 6.3 Flowchart showing addition and subtraction for binary
numbers in the signed 2's complement representation


6.2.2 Overflow Conditions 


When two 23-bit signed binary numbers with the same sign are added to each 
other, or when two similar numbers with different signs are subtracted from each 
other, the result may exceed 23 number bits. When this condition occurs, it is called 
overflow. When an overflow occurs, the sum or difference becomes incorrect. 

Overflow conditions that occur during addition or subtraction of binary numbers
in the signed 2's complement representation are shown in Table 6.2. As shown, while
two binary numbers are added, an overflow occurs if a carry appears from the most
significant number bit when the numbers are both positive, or if no carry appears
from the most significant number bit when the numbers are both negative. When
two binary numbers are subtracted from each other, an overflow occurs if a borrow
appears from the most significant number bit when the minuend is negative and the
subtrahend is positive, or if no borrow appears from the most significant number
bit when the minuend is positive and the subtrahend is negative.


TABLE 6.2 Overflow Conditions for Addition and Subtraction
(for binary numbers in the signed 2's complement representation)

        ADDITION OR     SIGN BITS
CASE    SUBTRACTION     A(0)   R(0)    OVERFLOW CONDITION†
A       Addition         0      0      Carry generated from the msb
B       Addition         1      1      No carry generated from the msb
C       Subtraction      0      1      No borrow generated by the msb
D       Subtraction      1      0      Borrow generated by the msb

†msb = most significant bit

If the overflow conditions above are incorporated into the flowchart of Fig.
6.3, the result is the flowchart shown in Fig. 6.4. In this figure, Z is the
sum or the difference, X is the augend or the minuend, Y is the addend or the
subtrahend, and x_0 and y_0 represent the signs of X and Y, respectively. The carry or
borrow refers to that from the most significant number bit.

In the serial arithmetic unit, the signs of the two numbers are stored in registers 
SA and SR. The carry or borrow is stored in register C. Addition or subtraction is 
indicated by register N. Thus, the above-stated overflow conditions lead to the ter- 
minal statement which defines terminal AVTEST in statement (6.7) for testing over- 
flow during addition and subtraction. 
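
The four rows of Table 6.2 can be checked mechanically. The following Python sketch is an illustration only and is not part of the CDL description; the function name avtest and its argument names are chosen here to mirror terminal AVTEST and registers N, SA, SR, and C.

```python
def avtest(n, sa, sr, c):
    """Overflow test of Table 6.2, one product term per case.

    n  : 1 for addition, 0 for subtraction (register N)
    sa : sign of the augend or minuend (register SA)
    sr : sign of the addend or subtrahend (register SR)
    c  : carry or borrow from the most significant number bit (register C)
    """
    return int(bool(
        (n and not sa and not sr and c)            # case A: add, both positive, carry
        or (n and sa and sr and not c)             # case B: add, both negative, no carry
        or (not n and not sa and sr and not c)     # case C: subtract, + minus -, no borrow
        or (not n and sa and not sr and c)         # case D: subtract, - minus +, borrow
    ))
```

Operands with different signs in addition, or with the same sign in subtraction, never satisfy any of the four terms, which agrees with the statement that overflow is tested only in the same-sign (addition) and different-sign (subtraction) cases.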


6.2.3 Configuration 


For addition and subtraction, registers A and R are used. As shown in Table 
6.1, the augend or the minuend is initially stored in register A, and the addend or 
subtrahend is initially stored in register R. After the addition or subtraction, the sum 
or the difference is stored in register A and the augend or minuend is lost. 

Register OV indicates the presence or absence of the carry or borrow from the
most significant bit according to the conditions in Table 6.2. In the case of addition
and subtraction, this condition indicates overflow, and register AV is then
set to 1 to indicate the overflow. However, addition and subtraction are also employed
within multiplication and division, where the presence or absence of this carry or
borrow does not indicate overflow. In that case register AV should not be set to 1,
because no overflow occurs.


6.2.4 Sequence Charts 


The addition and subtraction sequence is shown in the sequence chart in Fig.
6.5. The sequence chart in Fig. 6.6 shows the SUM-DIF subsequence which is
"called" during the addition and subtraction sequence when register SUM or DIF is
set to 1 (Fig. 6.5).


Fig. 6.4 Flowchart showing addition and subtraction for binary
numbers in the signed 2's complement representation


As shown in Fig. 6.6, the SUM-DIF subsequence examines registers SUM
and DIF in the waiting loops. When register SUM or DIF is found to contain a 1,
the subsequence begins by setting register N to 1 or 0, respectively. In either case,
the subsequence is then initialized by resetting registers BC, C, and OV to 0, and by
storing sign bit A(0) in register SA and sign bit R(0) in register SR. Now, the loop
for the bit addition begins. Outputs Z and W from the single-bit adder-subtracter
are first transferred to bits A(23) and C, respectively. Registers A and R are then
circularly shifted to the right one bit position, and counter BC is then incremented
by 1. When counter BC reaches 23, terminal AVTEST is tested for any possible
overflow. If an overflow occurs, register OV is set to 1. Counter BC is next tested for
the value of 24. If it does not contain 24, the subsequence returns to the step of
storing the outputs from the adder-subtracter. This bit-addition loop continues until
counter BC reaches 24. At this time, register SUM or DIF is reset to 0 if register N
is 1 or 0, respectively. The execution of the SUM-DIF subsequence is now completed,
and the subsequence returns to the waiting loop.


Fig. 6.5 Sequence chart for the binary addition and subtraction
sequence

In Fig. 6.5, the addition sequence begins by setting register SUM to 1 and the
subtraction sequence begins by setting register DIF to 1 in order to initiate the
SUM-DIF subsequence; either sequence then waits for its completion. The subsequence
is completed when register SUM or DIF contains a 0. The addition and subtraction
sequence then continues on to test whether register OV contains a 1. If it contains a 1,
register AV is set to 1 to indicate an addition overflow. At this time, the addition or
subtraction is completed.
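
The bit-addition loop described above can be traced with a short simulation. The Python sketch below is an illustration under stated assumptions, not the CDL description itself: registers A and R are held as 24-bit integers whose least significant bit plays the role of A(23) and R(23), and the name sum_dif is chosen here for convenience.

```python
BITS = 24                      # word length: one sign bit plus 23 number bits

def sum_dif(a, r, n):
    """Bit-serial A <- A + R (n = 1) or A <- A - R (n = 0).

    a, r are 24-bit patterns. Returns (new contents of A, OV).
    """
    sa, sr = a >> (BITS - 1), r >> (BITS - 1)   # SA <- A(0), SR <- R(0)
    c = ov = 0                                  # C <- 0, OV <- 0
    for bc in range(1, BITS + 1):               # BC counts 1 .. 24
        a23, r23 = a & 1, r & 1
        z = a23 ^ r23 ^ c                       # sum or difference bit Z
        x = a23 if n else a23 ^ 1               # N coincidence A(23)
        w = (x & r23) | (x & c) | (r23 & c)     # carry or borrow W
        a = (z << (BITS - 1)) | (a >> 1)        # Z enters at A(0), A shifts right
        r = (r23 << (BITS - 1)) | (r >> 1)      # R <- cir R
        c = w
        if bc == BITS - 1:                      # all 23 number bits processed
            if n:
                ov = int((not sa and not sr and c) or (sa and sr and not c))
            else:
                ov = int((not sa and sr and not c) or (sa and not sr and c))
    return a, ov
```

For example, sum_dif(5, 3, 1) leaves 8 in A with OV = 0, while adding 1 to the largest positive pattern 0x7FFFFF sets OV to 1.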


Fig. 6.6 Sequence chart for the SUM-DIF subsequence


6.3 Binary Multiplication 


Binary multiplication in this serial arithmetic unit makes use of Booth’s 
algorithm. The algorithm, the configuration, and the sequence chart are now 
described. 


6.3.1 Algorithm 


Booth's algorithm (4) is shown in the flowchart in Fig. 6.7 where X is the
multiplicand, P is the partial product, i is the index, and n is 23. Multiplier Y is

    Y = -y_0 \times 2^0 + \sum_{i=1}^{n} y_i 2^{-i}                    (6.8)


Fig. 6.7 Flowchart showing multiplication by Booth's algorithm
for binary numbers in the signed 2's complement representation


As shown in Fig. 6.7, this algorithm examines two neighboring multiplier bits, y_i
and y_{i+1}. When these two bits are 10, the new partial product is formed by
subtracting multiplicand X from the current partial product, or


P ← P − X


When they are 01, the new partial product P is formed by adding multiplicand X
to the current partial product, or


P ← P + X


When they are 00 or 11, the partial product P is neither added nor subtracted. In any 
of the three cases, the partial product P is then divided by 2, and index i is decremented 
by 1. Two other adjacent bits are examined and another new partial product is formed. 
This process continues until i reaches 0. At this time, the multiplication is completed. 

When the multiplication begins, only one bit, y_n, is examined. If it is a 1, the first
partial product P is formed by subtracting multiplicand X from the initial partial
product, as shown in Fig. 6.7; otherwise, there is neither addition nor subtraction.
As also shown in Fig. 6.7, there is no division by 2 after the last partial product (i.e.,
the product) is formed.
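
The steps of the flowchart can be followed on exact fractions. The Python sketch below is a hedged illustration: the function names are chosen here, the multiplier is passed as a list [y0, y1, ..., yn] with y0 the sign bit, and the bit to the right of y_n is taken as 0 so that only y_n decides the first step.

```python
from fractions import Fraction

def twos_value(bits):
    """Value of the signed 2's complement fraction [y0, y1, ..., yn]."""
    return -bits[0] + sum(Fraction(b, 2 ** i) for i, b in enumerate(bits) if i)

def booth_multiply(x, bits):
    """Multiply fraction x by the multiplier given as bits [y0, ..., yn]."""
    n = len(bits) - 1
    p = Fraction(0)               # initial partial product
    right = 0                     # stands for the missing bit right of y_n
    for i in range(n, -1, -1):    # index i runs from n down to 0
        pair = (bits[i], right)   # the pair y_i y_{i+1}
        if pair == (1, 0):
            p -= x                # P <- P - X
        elif pair == (0, 1):
            p += x                # P <- P + X
        if i > 0:
            p /= 2                # no division by 2 after the last step
        right = bits[i]
    return p
```

With x = 3/8 and multiplier bits [1, 0, 1, 1] (that is, −5/8), the function returns −15/64, the same as Fraction(3, 8) * twos_value([1, 0, 1, 1]).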


6.3.2 Configuration 


Binary multiplication by Booth’s algorithm makes use of registers A, R, Q, and
E. As shown in Table 6.1, the multiplicand and the multiplier are initially stored in 
registers R and Q, respectively, and registers A and E are both initially 0. During 
multiplication, registers A, Q, and E form a casregister for right shifting. After 
multiplication, the product is stored in registers A and Q with the more significant 
half of the product in register A, and the less significant half in register Q. 


6.3.3 Sequence Chart 


The multiplication sequence is shown in the sequence chart in Fig. 6.8. Since
addition and subtraction are required in the multiplication, the SUM-DIF
subsequence in Fig. 6.6 is called to perform addition and subtraction in the
multiplication sequence.

As shown in Fig. 6.8, registers E, A, and WC are first reset to 0. The repeated-
addition loop begins by examining bits Q(23) and E. If they are 01 or 10, an addition
or a subtraction is carried out, respectively. (For the first time, only 00 or 10 can
occur because register E is reset to 0.) In all cases, register OV is then reset to 0 and
counter WC is incremented by 1. Counter WC is next examined to see whether it
contains the value of 24. If it does not, casregister AQE is shifted to the right one bit
position and the sequence returns to examine bits Q(23) and E. This process continues
until counter WC reaches 24. At that time, a shift-right micro-operation is performed
in register Q in order that the less significant half of the product can be stored in
subregister Q(1-23). The multiplication is now completed.


Fig. 6.8 Sequence chart for the binary multiplication sequence

In Fig. 6.8, when the SUM subsequence is called to perform an addition, or
when the DIF subsequence is called to perform a subtraction, the multiplication
sequence waits for the completion of the subsequence by continuously examining
register SUM or DIF. It should be noted that the execution of this subsequence and
the examination of register SUM or DIF are performed simultaneously. This
simultaneous operation also occurs when the SUM-DIF subsequence is called for in the
multiplication sequence and in the division sequence.
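
The register movements of the multiplication sequence can be sketched as follows. This Python illustration makes several assumptions that are not part of the text: registers are 24-bit integers whose top bit is bit 0 of the text, the SUM and DIF subsequences are replaced by ordinary modular addition and subtraction, the shift of casregister AQE keeps A(0) unchanged, and the multiplicand is not the pattern 0x800000 (the value −1). The names multiply, signed, and product are chosen here.

```python
BITS = 24
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)

def signed(v):
    """Interpret a 24-bit pattern as a signed 2's complement integer."""
    return v - (1 << BITS) if v & SIGN else v

def multiply(r, q):
    """Return registers (A, Q) after the multiplication sequence.

    r : multiplicand pattern in register R
    q : multiplier pattern in register Q
    """
    a = e = 0                                   # E <- 0, A <- 0, WC <- 0
    for wc in range(1, BITS + 1):               # WC counts 1 .. 24
        q23 = q & 1
        if (q23, e) == (0, 1):
            a = (a + r) & MASK                  # addition (SUM subsequence)
        elif (q23, e) == (1, 0):
            a = (a - r) & MASK                  # subtraction (DIF subsequence)
        if wc < BITS:                           # shift casregister AQE right
            e = q & 1
            q = (q >> 1) | ((a & 1) << (BITS - 1))
            a = (a >> 1) | (a & SIGN)
        else:                                   # final shift of register Q only
            q = (q >> 1) | (a & SIGN)
    return a, q

def product(a, q):
    """Combine A and Q(1-23) into one signed integer scaled by 2**46."""
    return (signed(a) << (BITS - 1)) + (q & (SIGN - 1))
```

Because both operands are fractions scaled by 2**23, product(*multiply(3, 5)) gives 15, which stands for 15 × 2**−46.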




6.4 Binary Division 


Binary division utilizes the nonrestoring algorithm developed by Burks, 
Goldstine, and von Neumann (1). The algorithm, the configuration, the divide-stop 
condition, and the sequence charts are now presented. 


6.4.1 Algorithm 


The nonrestoring algorithm has been described in more detail elsewhere (4).
Briefly, it forms a partial remainder by adding or subtracting the divisor each time
a quotient bit (or rather a pseudo quotient bit) is generated. To be specific, let X and
Y be the dividend and the divisor, respectively. The partial remainder is formed by
the following equation,

    r_i = 2r_{i-1} + (1 - 2q_i)Y                                        (6.9)

in addition to the rule that, if the signs of remainder r_{i-1} (not r_i) and divisor Y
are the same, the quotient bit q_i is 1 and the partial remainder r_i is obtained by
subtracting divisor Y from 2r_{i-1}. If these two signs are different, the quotient bit q_i is
0 and the partial remainder r_i is obtained by adding divisor Y to 2r_{i-1}. Quotient Q
is formed by assembling the quotient bits q_i as below,

    Q = (-1 + 2^{-n}) + \sum_{i=1}^{n} q_i 2^{-(i-1)}                   (6.10)

where (-1 + 2^{-n}) is a correction added to the quotient bits. (This explains
why the q_i are called pseudo quotient bits.)

Figure 6.9 shows the flowchart of the nonrestoring algorithm. As shown, the
initial remainder is dividend X. There is a loop in which the signs of remainder r_{i-1}
and divisor Y are tested, quotient bit q_i is generated, quotient Q is assembled, a new
remainder is formed, and index i is incremented and then tested for exit from the
loop. After the exit, the correction term is added to quotient Q to give the final
quotient.
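
Equations (6.9) and (6.10) can be exercised on exact fractions. The Python sketch below is an illustration under stated assumptions: the name nonrestoring_divide is chosen here, a zero remainder is treated as positive (sign bit 0), and the magnitude of the dividend is taken to be smaller than that of the divisor.

```python
from fractions import Fraction

def nonrestoring_divide(x, y, n=23):
    """Return (Q, r_n) for dividend x and divisor y using n pseudo quotient bits."""
    r = Fraction(x)                       # the initial remainder is the dividend
    y = Fraction(y)
    qbits = []
    for _ in range(n):
        same_sign = (r < 0) == (y < 0)    # compare signs of r_{i-1} and Y
        q = 1 if same_sign else 0
        r = 2 * r + (1 - 2 * q) * y       # equation (6.9)
        qbits.append(q)
    # equation (6.10): correction term plus the weighted pseudo quotient bits
    quotient = Fraction(-1) + Fraction(1, 2 ** n) + sum(
        Fraction(q, 2 ** (i - 1)) for i, q in enumerate(qbits, start=1))
    return quotient, r
```

For x = 1/4, y = 1/2, and n = 4, the pseudo quotient bits are 1, 1, 0, 0, the corrected quotient is 9/16, and the final remainder is −1/2; the identity x = Q·Y + r_n·2^{-n} holds exactly (9/32 − 1/32 = 1/4).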


6.4.2 Divide-stop Condition 


If the divisor is too small with respect to the dividend, the quotient becomes too 
large to be held in register Q. When such a situation arises, a division overflow occurs; 
the quotient is incorrect and the division should be stopped. 

Fig. 6.9 Flowchart showing division by nonrestoring algorithm
for binary numbers in the signed 2's complement representation


As shown previously, the dividend and the divisor are chosen to be fractional.
It is highly desirable that the quotient also be chosen fractional. This choice leads
to the criterion that the dividend must be smaller in magnitude than the divisor, from
which the divide-stop condition can be established. There are four cases in determining
the divide-stop condition. They are shown in Table 6.3. These cases depend on the signs
of the dividend and the divisor. When the two signs are the same, a subtraction (cases a
and b) is performed to test whether the dividend is smaller than the divisor; otherwise,
an addition (cases c and d) is required for the test. In the case of a subtraction, the
divisor is subtracted from the dividend. The four cases are:




(a) When both signs are positive, division overflow occurs if the difference is positive or 
zero. It is found by noting that the sign of the difference is positive. 


(b) When both signs are negative (the numbers are now in the signed 2’s complement 
representation), division overflow occurs if the difference is negative or the magnitude 
of the difference is 0. It is found by noting that the sign of the difference is negative 
or the magnitude of the difference is 0. 


(c) When the sign of the dividend is positive but that of the divisor is negative, division 
overflow occurs if the sum is positive or 0. It is found by noting that the sign of the 
sum is positive. 

(d) When the sign of the dividend is negative but that of the divisor is positive, division
overflow occurs if the sum is negative or 0. It is found by noting that the sign of the
sum is negative or the magnitude of the sum is 0.


TABLE 6.3 Divide-stop Test Conditions
(for binary numbers in the signed 2's complement representation)

        SIGN BITS
CASE    A(0)   R(0)    OPERATION    DIVISION OVERFLOW OCCURS IF    TEST CONDITION†
a        0      0      Subtract     Difference is positive or 0    Sign is +
b        1      1      Subtract     Difference is negative or 0    Sign is − or magnitude is 0
c        0      1      Add          Sum is positive or 0           Sign is +
d        1      0      Add          Sum is negative or 0           Sign is − or magnitude is 0

†Sign and magnitude refer to those of the sum or difference.


The third terminal statement (6.7) defines terminal DVSTOP which embodies
these four divide-stop test conditions in Table 6.3. This terminal statement is obtained
as follows. The required addition or subtraction is represented by N or N',
respectively, as register N indicates addition or subtraction. The signs of the dividend
and the divisor are indicated by bits A(0) and R(0), respectively. Whether the sign of
the sum or the difference is positive or negative is indicated by SA' or SA,
respectively, as register SA stores the sign of the sum or the difference. Whether the
magnitude of the sum or the difference is 0 or not is indicated by E' or E, respectively,
as will be shown. Thus, the divide-stop conditions for cases (a) and (c) in Table 6.3
can be represented by the following Boolean expressions:


N'*A(0)'*R(0)'*SA' and N*A(0)'*R(0)*SA'


The divide-stop conditions for cases (b) and (d) can be represented by the following
Boolean expressions:


N'*A(0)*R(0)*(SA+E') and N*A(0)*R(0)'*(SA+E')


Terminal DVSTOP is merely the logical OR of the Boolean expressions above.
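
The four test conditions can be compared against the criterion that the magnitude of the dividend must be smaller than that of the divisor. The Python sketch below is an illustration that follows the wording of Table 6.3 row by row; the function names, the 5-bit word length used for the exhaustive check, and the convention that e = 0 means a zero sum or difference are assumptions made for this example.

```python
def dvstop(n, a0, r0, sa, e):
    """Divide-stop test built from the four rows of Table 6.3.

    n      : 1 when the test adds, 0 when it subtracts
    a0, r0 : sign bits of the dividend and the divisor
    sa     : sign of the sum or difference
    e      : 0 when the sum or difference is zero, 1 otherwise
    """
    negative_or_zero = sa or not e
    return int(bool(
        (not n and not a0 and not r0 and not sa)       # case a
        or (not n and a0 and r0 and negative_or_zero)  # case b
        or (n and not a0 and r0 and not sa)            # case c
        or (n and a0 and not r0 and negative_or_zero)  # case d
    ))

def divide_stop(x, y, bits=5):
    """Apply the test to signed integers x (dividend) and y (divisor)."""
    mask = (1 << bits) - 1
    a0, r0 = int(x < 0), int(y < 0)
    n = a0 ^ r0                              # add when the signs differ
    t = (x + y if n else x - y) & mask       # sum or difference, word length bits
    sa = t >> (bits - 1)                     # its sign bit
    e = int(t != 0)                          # OR of all its bits
    return dvstop(n, a0, r0, sa, e)
```

Checked over every pair of 5-bit operands, divide_stop(x, y) is 1 exactly when abs(x) >= abs(y), which is the situation in which the quotient would not be a proper fraction.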




6.4.3 Configuration 


Binary division by nonrestoring algorithm makes use of registers A, R, and Q. 
As shown in Table 6.1, the dividend and the divisor are initially stored in registers 
A and R, respectively, and register Q is initially reset to 0. At the completion of the 
division, the quotient appears in register Q and the remainder in register A. The 
divisor in register R remains unchanged, while the dividend is lost. If a division over- 
flow occurs, it is indicated by register DV with its contents being 1. As mentioned 
previously, registers SA and E indicate the sign and zero-magnitude during the divide- 
stop test. 


6.4.4 Sequence Charts 


The sequence chart for the division sequence is shown in Fig. 6.10. It consists
of three parts: initialization, divide-overflow test, and division. As shown, the
initialization part merely resets registers DV, Q, and WC to 0. Then, the DSTEST (i.e.,
divide-stop test) subsequence is “called.” After returning to the division sequence
from the DSTEST subsequence, the division part follows.

The sequence chart for the divide-stop subsequence is shown in Fig. 6.11. As 
shown, this subsequence is in a waiting loop and is continuously testing register 
DSTEST. When it finds that register DSTEST contains a 1, the DSTEST subsequence 
begins. It first resets registers BC, C, and E to 0, and sets register N to 1 (if addition) 
or 0 (if subtraction). Then it begins the add-subtract loop during which an addition 
(or a subtraction) is serially performed. During this addition (or subtraction), the 
carry bit (or borrow bit) is stored in register C, but the sum bit (or the difference bit) 
is ignored because it is of no use except for the bit from the leftmost bit position 
which is stored in register SA for use by terminal DVSTOP. Furthermore, during 
the addition (or subtraction), the logical OR of the sum bit (or the difference bit) 
with the contents of register E is stored in register E; in this way, when register 
E contains a 0, it indicates that the magnitude of the sum (or difference) is 0. The 
loop is iterated 24 times. After the exit from the loop, terminal DVSTOP is tested 
for division overflow. If it occurs, register DV is set to 1. At this time, the subse- 
quence is completed, and the register DSTEST is reset to 0 to return to the division 
sequence. 

After the return, the division sequence begins by testing register DV. If it contains
a 1, the division sequence is terminated. Otherwise, it begins the divide loop. In this
loop, six functions are performed, namely, setting bit Q(23) to 1 or 0, left-shifting
casregister AQ, calling the SUM-DIF subsequence, resetting register OV to 0,
incrementing register WC, and testing register WC. If register WC does not contain 23,
the loop is iterated until register WC reaches 23. Then, register Q is shifted one bit
position to the left, and the correction to the quotient bits is made. This correction
consists of complementing bit Q(0) and setting bit Q(23) to 1. The division sequence
is then terminated.


Fig. 6.10 Sequence chart for the binary division sequence


Fig. 6.11 Sequence chart for the divide-stop test subsequence


6.5 Statement Descriptions 


As the configuration and sequential operations of the addition, subtraction, 
multiplication, and division have been presented, the control part and the arithmetic 
sequences of the binary, serial arithmetic unit are now described by the CDL state- 
ments. 




6.5.1 Control Configuration 


The control part of the binary, serial arithmetic unit makes use of five registers:
SC, SUBC, F, G, and I. Register SC together with a decoder generates control
signals K for sequencing the four arithmetic sequences. Register SUBC together
with a decoder generates control signals J for sequencing the SUM-DIF and
DSTEST subsequences. Register I indicates a fetch cycle (when 1) and an execution
cycle (when 0). Register F stores the op-code. Register F together with a decoder
and register I generates the signals for commanding the arithmetic sequences and a
combined signal for commanding those micro-operations common to both addition and
subtraction. There is a two-phase clock, CP, from which two two-phase clocks, P and
Y, are generated. When register G is 1, clock P occurs; otherwise, clock Y appears.
Thus, clock P can be switched to clock Y by resetting register G to 0 and vice versa.
Clock P is used for the four arithmetic sequences, while clock Y is used for the SUM-DIF
and DSTEST subsequences. In addition, there is switch START for initializing the
sequences.

The configuration above for the control part is shown in the block diagram of 
Fig. 6.12 and is described by the following CDL declaration statements: 


Fig. 6.12 Configuration of the control part of the binary, serial
arithmetic unit


Comment, control configuration of the binary, serial arithmetic unit         (6.11)
Register,  SC(2-0),             $sequence counter
           SUBC(2-0),           $subsequence counter
           G,                   $clock control register
           F(1-2),              $op-code register
           I,                   $fetch (when 1) and execution (when 0)
Decoder,   M(0-3)=F,            $op-code decoder
           K(0-7)=SC,           $sequence decoder
           J(1-4)=SUBC,         $subsequence decoder
Terminal,  SD=SUM+DIF,          $add and subtract combined command
           ADD=M(0)*I',         $addition command
           SUB=M(1)*I',         $subtraction command
           MPY=M(2)*I',         $multiplication command
           DIV=M(3)*I',         $division command
Clock,     CP(1-2),             $two-phase clock
Terminal,  P(1-2)=CP(1-2)*G,    $clock for sequences
           Y(1-2)=CP(1-2)*G',   $clock for subsequence
Switch,    START(ON),
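
The switching between clocks P and Y by register G amounts to simple gating. The Python sketch below illustrates the two clock terminals only; the function name gate_clocks and the representation of the two clock phases as a pair of 0/1 levels are assumptions made for this example.

```python
def gate_clocks(cp, g):
    """Derive clocks P and Y from the two-phase clock CP and register G.

    cp : pair (CP(1), CP(2)) of pulse levels, each 0 or 1
    g  : contents of register G
    Returns the pairs (P(1), P(2)) and (Y(1), Y(2)).
    """
    p = tuple(phase & g for phase in cp)          # P = CP * G
    y = tuple(phase & (1 - g) for phase in cp)    # Y = CP * G'
    return p, y
```

With G = 1 every pulse of CP appears on P and none on Y, so the arithmetic sequence advances; with G = 0 the pulses appear on Y instead and only the subsequence advances.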


6.5.2 Addition and Subtraction Sequences 


Initially, the op-code is in register F and the operands are in registers R, A,
and Q. Register SC is incremented by one every clock P(2). Register SUBC is
incremented every clock Y(2). When the START switch is turned to the ON position,
registers SC and I are reset to 0 and register G is set to 1. These micro-operations are
described by the following execution statements:


Comment, here begins the arithmetic sequences                                (6.12)
/START(ON)/            G←1, SC←0, I←0,
/P(2)/                 SC←countup SC,
/(SD+DSTEST)*Y(2)/     SUBC←countup SUBC,


From the sequence chart for the addition and subtraction sequences in Fig. 
6.5, the addition and subtraction sequences are described below: 


Comment, here begins the addition sequence                                   (6.13)
/ADD*K(1)*P(1)/    SUM←1, G←0, SUBC←0, AV←0,
Comment, the sequence is now waiting for completion of the SUM subsequence
/ADD*K(2)*P(1)/    IF (OV=1) THEN (AV←1),
/ADD*K(3)*P(1)/    I←1, SC←7,

Comment, here begins the subtraction sequence                                (6.14)
/SUB*K(1)*P(1)/    DIF←1, G←0, SUBC←0, AV←0,
Comment, the sequence is now waiting for completion of the DIF subsequence
/SUB*K(2)*P(1)/    IF (OV=1) THEN (AV←1),
/SUB*K(3)*P(1)/    I←1, SC←7,

In the statements above, the SUM-DIF subsequence is called by setting register
SUM or DIF to 1 and, in order to start the subsequence with the required control
signals, by resetting registers G and SUBC to 0. From the sequence chart in Fig.
6.6, this subsequence is described below:


Comment, here begins the SUM-DIF subsequence                                 (6.15)
/SUM*J(1)*Y(1)/    N←1,
/DIF*J(2)*Y(1)/    N←0,
/SD*J(3)*Y(1)/     BC←0, C←0, SA←A(0), SR←R(0), OV←0,
/SD*J(4)*Y(1)/     A←Z-A(0-22), C←W, R←cir R, BC←countup BC,
/SD*J(5)*Y(1)/     IF ((BC=23)*(AVTEST=1)) THEN (OV←1),
                   IF (BC≠24) THEN (SUBC←2)
                   ELSE (IF (N=1) THEN (SUM←0)
                         ELSE (DIF←0), G←1),


6.5.3 Multiplication Sequence 


From the sequence chart for the multiplication sequence in Fig. 6.8, the multi- 
plication sequence is described below. 


Comment, here begins the multiplication sequence                             (6.16)
/MPY*K(1)*P(1)/    E←0, A←0, WC←0,
/MPY*K(2)*P(1)/    IF (Q(23)'*E) THEN (SUM←1),
                   IF (Q(23)*E') THEN (DIF←1),
                   IF (Q(23)⊕E) THEN (SUBC←0, G←0),
Comment, the sequence is now waiting for the completion of the SUM
         or DIF subsequence
/MPY*K(3)*P(1)/    OV←0, WC←countup WC,
/MPY*K(4)*P(1)/    IF (WC=24) THEN (Q←A(0)-Q(0-22))
                   ELSE (AQE(1-48)←AQ, SC←0),
/MPY*K(5)*P(1)/    I←1, SC←7,


6.5.4 Division Sequence 


From the sequence chart for the division sequence in Fig. 6.10, the division 
sequence is described in the following: 


Comment, here begins the division sequence                                   (6.17)
/DIV*K(1)*P(1)/    Q←0, WC←0, DV←0, DSTEST←1, SUBC←0,
Comment, the sequence is waiting for the completion of the DSTEST subsequence
/DIV*K(2)*P(1)/    IF (DV=1) THEN (SC←6),
/DIV*K(3)*P(1)/    IF (A(0)=R(0)) THEN (Q(23)←1) ELSE (Q(23)←0),
/DIV*K(4)*P(1)/    IF (A(0)=R(0)) THEN (DIF←1) ELSE (SUM←1),
                   AQ←shl AQ, G←0, SUBC←0,
Comment, the sequence is waiting for the completion of the SUM or DIF subsequence
/DIV*K(5)*P(1)/    OV←0, WC←countup WC,
/DIV*K(6)*P(1)/    IF (WC≠23) THEN (SC←1) ELSE (Q←shl Q),
/DIV*K(7)*P(1)/    Q(0)←Q(0)', Q(23)←1, I←1,

The description of the SUM-DIF subsequence has been shown above. From 
the sequence chart in Fig. 6.11, the DSTEST subsequence is described as follows: 

Comment, here begins the DSTEST subsequence                                  (6.18)
/DSTEST*J(2)*Y(1)/    N←A(0)⊕R(0), BC←0, C←0, E←0,
/DSTEST*J(3)*Y(1)/    C←W, SA←Z, E←E+Z, A←cir A, R←cir R, BC←countup BC,
/DSTEST*J(4)*Y(1)/    IF (BC≠24) THEN (SUBC←1)
                      ELSE (IF (DVSTOP=1) THEN (DV←1),
                            DSTEST←0, G←1),


6.6 Organization of a Decimal Arithmetic Unit 


A decimal arithmetic unit similar to the binary, serial arithmetic unit can be
organized. In the following, the number representation, the modes of operation, the
decimal-digit adder-subtracter, the configuration, and the arithmetic sequences of a
decimal arithmetic unit are presented.


6.6.1 Binary Coded Decimal Numbers 


Decimal numbers are commonly represented with each decimal digit coded as a
binary number. For example, the decimal digit may be represented by the
8-4-2-1 code, the 2-4-2-1 code, the 5-1-1-1-1 code, the excess-3 code, or the
2-out-of-5 code. The decimal numbers in any such representation are called
binary-coded decimal numbers (or simply BCD numbers). A more detailed discussion
of the BCD numbers is presented elsewhere (4). For the decimal arithmetic unit
to be presented here, the 8-4-2-1 code is chosen. This code makes use of the first
10 four-bit binary numbers to represent the 10 digit values 0, 1, 2, ..., 9, respectively,
while the remaining six binary numbers are not used.

A decimal number can be signed or unsigned. A signed decimal number has a
sign bit. The sign bit, usually located at the most significant digit position of the
number, is chosen here with the value of 0 or 1 as positive or negative sign, respectively.
There are three representations of signed decimal numbers:


1. The signed magnitude representation, 
2. The signed 10’s complement representation, 


3. The signed 9’s complement representation. 


When the decimal number is positive, the number digits in all three
representations above represent the magnitude of the decimal number. When it is negative,
the representations differ. In the signed magnitude representation, the
number digits still represent the magnitude, while in the other two representations,
the number digits represent the 10’s complement or the 9’s complement of the
magnitude. These representations are similar to the three representations of a signed
binary number.

Similar to the signed 2’s complement representation of a binary number, the
signed 10’s complement representation of a decimal number gives the actual value
of the decimal number, if the value of the sign digit is regarded as negative. For
example, consider the decimal number 1,523 (or −477) in the signed 10’s complement
representation. If the value of the sign digit is regarded to be −1000 instead of +1000,
then we have


1,523 = −1 × 10^3 + 5 × 10^2 + 2 × 10^1 + 3 × 10^0
      = −1000 + 523
      = −477

where −477 is the actual value of 1,523. Regarding the value of the sign digit as
negative does not affect a positive decimal number, because when the sign digit
is positive, the value of the sign digit is zero.
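
The rule of treating the sign digit as a negative weight can be stated in a few lines. The Python sketch below is an illustration; the function name tens_complement_value and its argument layout (a sign digit of 0 or 1 followed by the number digits, most significant first) are assumptions made for this example.

```python
def tens_complement_value(sign, digits):
    """Actual value of a number in the signed 10's complement representation.

    sign   : 0 for positive, 1 for negative
    digits : number digits, most significant first
    """
    weight = 10 ** len(digits)            # weight of the sign digit position
    magnitude_part = 0
    for d in digits:
        magnitude_part = 10 * magnitude_part + d
    return -sign * weight + magnitude_part
```

For the example above, tens_complement_value(1, [5, 2, 3]) gives −1000 + 523 = −477, while a sign digit of 0 leaves 523 unchanged.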
If the 1’s and 0’s of the above-mentioned 2-4-2-1 code or the excess-3 code for
a decimal number are complemented, the resulting decimal number is the 9’s
complement of the original decimal number. Such a code is called self-complementing.
A decimal number in a self-complementing code can readily give its 9's complement.
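
The self-complementing property can be verified by enumeration. In the Python sketch below, the table for the 2-4-2-1 code is the commonly used assignment and is an assumption, since the code table itself is not listed here; the function name is_self_complementing is also chosen for this example.

```python
# Assumed code tables (digit -> 4-bit code word).
CODE_2421 = {0: 0b0000, 1: 0b0001, 2: 0b0010, 3: 0b0011, 4: 0b0100,
             5: 0b1011, 6: 0b1100, 7: 0b1101, 8: 0b1110, 9: 0b1111}
CODE_EXCESS_3 = {d: d + 3 for d in range(10)}
CODE_8421 = {d: d for d in range(10)}

def is_self_complementing(code):
    """True if complementing every bit of a digit's code yields the code of 9 - digit."""
    decode = {word: digit for digit, word in code.items()}
    return all(decode.get(~word & 0b1111) == 9 - digit
               for digit, word in code.items())
```

Under these tables the 2-4-2-1 and excess-3 codes pass the test, while the 8-4-2-1 code does not, because complementing 0000 gives 1111, which is one of its six unused combinations.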

The methods for decimal addition and subtraction are similar to those for a 
binary number. However, the addition or subtraction of two digits differs from that 
of two bits if decimal digits are represented by the BCD numbers. Let us illustrate 
this difference by adding decimal numbers 795 and 683 which are expressed in the 
8-4-2-1 code as shown below. 


Decimal        Hundreds          Tens          Units
                   1 (carry)
   795          0111             1001          0101
  +683         +0110            +1000         +0011
 1,478          1110           1,0001          1000
               +0110            +0110
              1,0100(4)          0111(7)       1000(8)

(The leading 1 of the hundreds result is the carry in the thousands position.)


In the above, the digit sum 1000 in the unit position is correct, but the digit 
sum 0001 in the ten’s position is incorrect. Not only is the digit sum 1110 in the 
hundred’s position incorrect, but the carry is also missing. However, if the binary 
number 0110 is added to both tens and hundreds positions, the digit sums now all 
become correct and the carry in the thousands position also appears. Therefore, for 
a given choice of decimal digit representation, a correction algorithm should be 
developed so that two decimal digits can be added or subtracted correctly. 
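
The correction used in the example above, adding 0110 whenever a digit sum exceeds 9, can be written as a short routine for the 8-4-2-1 code. The Python sketch below is an illustration; the function name bcd_add and the convention of passing digits least significant first are assumptions made for this example.

```python
def bcd_add(a_digits, b_digits):
    """Add two 8-4-2-1 coded numbers digit by digit.

    a_digits, b_digits : equal-length lists of digits, least significant first
    Returns (sum digits, least significant first; final carry).
    """
    carry = 0
    out = []
    for a, b in zip(a_digits, b_digits):
        s = a + b + carry          # plain binary sum of the two 4-bit codes
        if s > 9:                  # unused combination or carry out of 4 bits
            s += 6                 # add the correction 0110
        carry = s >> 4             # decimal carry into the next digit position
        out.append(s & 0b1111)     # corrected digit
    return out, carry
```

For 795 + 683, bcd_add([5, 9, 7], [3, 8, 6]) returns the digits 8, 7, 4 and a final carry of 1, that is, 1,478.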

In summary, the decimal number for the decimal arithmetic unit is chosen to be
in the signed magnitude representation. The decimal digit is represented by the
8-4-2-1 code. The numbers are integers of eight decimal digits and a sign bit.


6.6.2 Modes of Operation 


A binary coded decimal number may be represented by one or more time se- 
quences of 1’s and 0’s. There are four such time representations: 


1. The serial-digit serial-bit representation 
2. The parallel-digit serial-bit representation 
3. The serial-digit parallel-bit representation 


4. The parallel-digit parallel-bit representation. 


As an example, consider decimal number 951. For the 8-4-2-1 code, it becomes 
1001, 0101, 0001. The four representations of this number are shown in Fig. 6.13, 
where t_i indicates the time that advances as index i increases. In Fig. 6.13(a), all digits 


Sec. 6.6 Organization of a Decimal Arithmetic Unit 233 


Fig. 6.13 Four representations of a binary coded decimal 
number: (a) serial-digit serial-bit representation; (b) 
parallel-digit serial-bit representation; (c) serial-digit 
parallel-bit representation; (d) parallel-digit parallel- 
bit representation 


and all bits appear in one binary sequence; it is the serial-digit serial-bit represen- 
tation. When the digits are arranged in parallel and the bits in series as shown in 
Fig. 6.13(b), it is the parallel-digit serial-bit representation. If the digits occur in 
series while the bits occur in parallel as shown in Fig. 6.13(c), it is the serial-digit 
parallel-bit representation. In Fig. 6.13(d), all digits and all bits appear in parallel; 
it is the parallel-digit parallel-bit representation. 
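
The four time representations can be sketched in Python for the number 951. This is an illustrative sketch only: the names `bcd_bits` and `time_representations` are hypothetical, each representation is modeled as a list of time steps (each step listing the bits present on the lines at that time), and the most-significant-first ordering is chosen for readability rather than taken from Fig. 6.13.

```python
def bcd_bits(number, width):
    """Return one 4-bit list (weights 8, 4, 2, 1) per decimal digit."""
    digits = [int(ch) for ch in str(number).zfill(width)]
    return [[(d >> k) & 1 for k in (3, 2, 1, 0)] for d in digits]


def time_representations(number, width):
    codes = bcd_bits(number, width)
    return {
        # one line, one bit per time step
        "serial-digit serial-bit": [[bit] for code in codes for bit in code],
        # one line per digit, one bit of every digit per time step
        "parallel-digit serial-bit": [list(col) for col in zip(*codes)],
        # four lines, one whole digit per time step
        "serial-digit parallel-bit": [list(code) for code in codes],
        # 4 x width lines, everything in a single time step
        "parallel-digit parallel-bit": [[bit for code in codes for bit in code]],
    }


for name, steps in time_representations(951, 3).items():
    print(f"{name}: {len(steps)} time steps x {len(steps[0])} lines")
```

The trade-off between time steps and signal lines is what separates the four modes of operation.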

The addition of two decimal numbers can be achieved by using a decimal adder 
which consists of one or more decimal-digit adders. A decimal-digit adder can be 
serial if the two decimal digits are added serially (i.e., bit by bit) or parallel if they 
are added in parallel (i.e., all bits at the same time). These adders are called the 
serial decimal-digit adder (SDDA) and the parallel decimal-digit adder (PDDA). The 
manner in which two decimal numbers are added depends on how the BCD number 
is represented. There are four modes of operations for the four time representations 
of the BCD number. If the BCD number is in the serial-digit serial-bit representation, 
the addition requires the use of an SDDA, and it is in the serial mode of operation. 
If the BCD number is in the parallel-digit serial-bit representation, the addition 
requires the use of as many SDDA’s as the number of digits in the decimal number, 
and it is in the parallel-digit mode of operation. If the BCD number is in the serial- 
digit parallel-bit representation, the addition requires the use of a PDDA, and it is 
in the serial-digit mode of operation. If the BCD number is in the parallel-digit parallel- 
bit representation, the addition requires the use of as many PDDA’s as the number 
of digits in the decimal number, and it is in the parallel mode of operation. 

Similarly, there are four modes of operations for subtracting one decimal number 
from another. For these modes of operations, one or more serial decimal-digit 
subtracters (SDDS) or one or more parallel decimal-digit subtracters (PDDS) are required. 

It is possible to combine an adder and a subtracter into an adder-subtracter. 
In this case, there are the serial decimal-digit adder-subtracter (SDDAS) and the 
parallel decimal-digit adder-subtracter (PDDAS). 

For the decimal arithmetic unit, the BCD numbers are chosen in the serial-digit 
parallel-bit representation. Therefore, a PDDA, a PDDS, or a PDDAS is required; 
the particular choice depends on the arithmetic algorithms employed. 


6.6.3 Decimal-digit Adders and Subtracters 


A decimal-digit adder adds an addend digit, an augend digit, and an input-carry 
bit to produce a sum digit and an output-carry bit. The decimal-digit adders and 
subtracters described here are those for the binary coded decimal digit using the 
8-4-2-1 code. A parallel decimal-digit adder (PDDA) is shown in the block diagram 
of Fig. 6.14. There are three inputs: the 4-bit augend X(1-4), the 4-bit addend Y(1-4), 
and the single-bit input-carry C. There are two outputs: the 4-bit sum Z(1-4) and the 
single-bit output-carry Z(0). 

The use of the 8-4-2-1 code allows the use of a binary adder as a part of the 
decimal-digit adder. The PDDA in Fig. 6.14 consists of a 4-bit parallel binary adder 
and a correction logic network. Outputs W(0-4) from the binary adder give the 
binary sum of the inputs. There are 20 possible values of this binary sum (0 through 19), 
shown in the left column of Table 6.4, where the carry is located at the left of the comma. 
The values in the right column are the respective correct sums Z(0-4). The correction 
logic network performs the function of converting the sum in the left column to the 
corresponding one in the right column. A close examination of Table 6.4 reveals 
that simple correction rules can be formulated. The first 10 sums (0-9) are correct; 
thus, no correction is needed. For each of the remaining 10 sums (10-19), an increment 
of 6 is required; therefore, the sum W(0-4) in the left column is increased by 
6 to give the correct code Z(0-4) in the right column. 


Fig. 6.14 Block diagram of a parallel decimal-digit adder using 
the 8-4-2-1 code 


TABLE 6.4 Uncorrected and Corrected Digit Sums of 
Two Decimal Digits in the 8-4-2-1 Code 

UNCORRECTED           CORRECTED 
DIGIT SUM W(0-4)      DIGIT SUM Z(0-4) 

0,0000                0,0000 = 0 
0,0001                0,0001 = 1 
0,0010                0,0010 = 2 
0,0011                0,0011 = 3 
0,0100                0,0100 = 4 
0,0101                0,0101 = 5 
0,0110                0,0110 = 6 
0,0111                0,0111 = 7 
0,1000                0,1000 = 8 
0,1001                0,1001 = 9 
0,1010                1,0000 = 10 
0,1011                1,0001 = 11 
0,1100                1,0010 = 12 
0,1101                1,0011 = 13 
0,1110                1,0100 = 14 
0,1111                1,0101 = 15 
1,0000                1,0110 = 16 
1,0001                1,0111 = 17 
1,0010                1,1000 = 18 
1,0011                1,1001 = 19 

The above PDDA using the 8-4-2-1 code is now defined as operator "decadd" 
by the following CDL statements. 


Comment, spec. of a parallel decimal-digit adder using the 8-4-2-1 code          (6.19) 
Operator, Z(0-4)=X(1-4) decadd Y(1-4)-C 
Comment, C is the input carry, while W(0) is the output carry 
Terminal, CB(4)=C, 
          CB(0-3)=X(1-4)*Y(1-4)+Y(1-4)*CB(1-4)+CB(1-4)*X(1-4), 
          W(0)=CB(0), 
          W(1-4)=X(1-4)⊕Y(1-4)⊕CB(1-4), 
Comment, W(1-4) are the sum outputs of the 4-bit parallel binary adder 
/begin/   IF (W=0+1+...+9) THEN (Z=W), 
          IF (W=10+11+...+19) THEN (Z=W add 6), 
end of operator 


The above terminal statement describes the 4-bit parallel binary adder with terminals 
W(0-4) as its outputs. The two conditional micro-statements describe the correction. 
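
The behavior of the "decadd" operator can be checked with a small Python model. This is a sketch under stated assumptions rather than a translation of CDL: the helper names `to_bits`, `from_bits`, `binary_adder`, and `decadd` (as a Python function) are hypothetical, and bit index 1 is taken to carry weight 8 so that CB(4)=C feeds the least significant position.

```python
def to_bits(value, first, last):
    """Spread an integer over bit indices first..last (first = most significant)."""
    n = last - first + 1
    return {first + k: (value >> (n - 1 - k)) & 1 for k in range(n)}


def from_bits(bits):
    """Reassemble an integer from a dict of indexed bits."""
    value = 0
    for index in sorted(bits):
        value = (value << 1) | bits[index]
    return value


def binary_adder(x, y, c):
    """4-bit ripple adder following the CB and W terminal equations."""
    cb = {4: c}                                   # CB(4) = C
    w = {}
    for i in (4, 3, 2, 1):                        # least significant bit first
        cb[i - 1] = (x[i] & y[i]) | (y[i] & cb[i]) | (cb[i] & x[i])
        w[i] = x[i] ^ y[i] ^ cb[i]
    w[0] = cb[0]                                  # W(0) = CB(0)
    return w


def decadd(x_digit, y_digit, c):
    """Return (Z(0), Z(1-4)) for two decimal digits and an input carry."""
    w = from_bits(binary_adder(to_bits(x_digit, 1, 4), to_bits(y_digit, 1, 4), c))
    z = w if w <= 9 else w + 6                    # the two conditional micro-statements
    return z >> 4, z & 0b1111


print(decadd(9, 8, 0))
```

Running the function over all digit pairs reproduces the right column of Table 6.4.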

A parallel decimal-digit subtracter (PDDS) can be similarly defined. It consists 
of a 4-bit binary subtracter and a correction logic network. The inputs to the binary 
subtracter are: the 4-bit minuend X(1-4), the 4-bit subtrahend Y(1-4), and the input 
borrow C. The outputs from the binary subtracter are terminals W(0-4). Terminal 
W(0) is the output borrow, and terminals W(1-4) are the difference. The outputs 
from the PDDS are terminals Z(0-4); Z(0) is the output borrow and Z(1-4) the 
difference. The 20 uncorrected borrows and differences W(0-4) and the corresponding 
corrected borrows and differences Z(0-4) are shown in Table 6.5, where the borrow 
enclosed by parentheses is located at the left of the comma. The correction logic 
network converts the difference in the left column to the respective one in the right 
column. Whenever there is a borrow, the digit is negative and it is in the 10's 
complement. Examination of Table 6.5 reveals again that simple correction rules can be 
formulated. The first 10 differences (9 to 0) are correct; thus, no correction is needed. For 
each of the remaining 10 differences (-1 to -10), a decrement of 6 is required; therefore, 
the difference W(0-4) in the left column is decreased by 6 to give the correct code 
Z(0-4) in the right column. 

TABLE 6.5 Uncorrected and Corrected Digit Difference 
of Two Decimal Digits in the 8-4-2-1 Code† 

UNCORRECTED              CORRECTED 
DIGIT DIFFERENCE W       DIGIT DIFFERENCE Z 

0,1001                   0,1001 = 9 
0,1000                   0,1000 = 8 
0,0111                   0,0111 = 7 
0,0110                   0,0110 = 6 
0,0101                   0,0101 = 5 
0,0100                   0,0100 = 4 
0,0011                   0,0011 = 3 
0,0010                   0,0010 = 2 
0,0001                   0,0001 = 1 
0,0000                   0,0000 = 0 
(1),1111                 (1),1001 = -1 
(1),1110                 (1),1000 = -2 
(1),1101                 (1),0111 = -3 
(1),1100                 (1),0110 = -4 
(1),1011                 (1),0101 = -5 
(1),1010                 (1),0100 = -6 
(1),1001                 (1),0011 = -7 
(1),1000                 (1),0010 = -8 
(1),0111                 (1),0001 = -9 
(1),0110                 (1),0000 = -10 

†The 1 enclosed by parentheses denotes a borrow; 
the digit is in the 10's complement 


The above PDDS using the 8-4-2-1 code is now defined as operator "decsub" 
by the following CDL statements. 


Comment, spec. of a parallel decimal-digit subtracter using the 8-4-2-1 code     (6.20) 
Operator, Z(0-4)=X(1-4) decsub Y(1-4)-C 
Comment, C is the input borrow, while W(0) is the output borrow 
Terminal, CB(4)=C, 
          CB(0-3)=X(1-4)'*Y(1-4)+Y(1-4)*CB(1-4)+CB(1-4)*X(1-4)', 
          W(0)=CB(0), 
          W(1-4)=X(1-4)⊕Y(1-4)⊕CB(1-4), 
Comment, W(1-4) are the difference outputs of the 4-bit parallel binary subtracter 
/begin/   IF (W(0)=0) THEN (Z=W), 
          IF (W(0)=1) THEN (Z=W sub 6), 
end of operator 


The terminal statement above describes the 4-bit parallel binary subtracter where 
W(0-4) are the outputs of this subtracter. The two conditional micro-statements 
describe the correction. 

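
The subtracter correction can be sketched in Python in the same spirit. This is a minimal sketch, assuming that the borrow and the 4-bit difference together behave like a 5-bit two's-complement result; the function name `decsub` mirrors the CDL operator but the Python body is an illustration, not the book's definition.

```python
def decsub(x_digit, y_digit, borrow_in):
    """Return (borrow_out, difference_digit) for x - y - borrow_in."""
    # W(0-4): borrow in bit 4, binary difference in bits 3..0
    w = (x_digit - y_digit - borrow_in) & 0b11111
    if w >> 4:                        # W(0) = 1: a borrow occurred
        w = (w - 6) & 0b11111         # Z = W sub 6
    return w >> 4, w & 0b1111


for x, y, b in [(5, 3, 0), (0, 1, 0), (0, 9, 1)]:
    print((x, y, b), decsub(x, y, b))
```

When a borrow is produced, the difference digit is in the 10's complement, so that digit minus ten times the borrow equals the signed difference.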
The above PDDA and PDDS can be combined into a PDDAS. In this case, a control 
input N is needed. The PDDAS functions as an adder when N is 1 and as a 
subtracter when N is 0. This PDDAS using the 8-4-2-1 code is now defined as operator 
"decaddsub" by the following CDL statements. 


Comment, spec. of a parallel decimal-digit adder-subtracter using the 8-4-2-1 code   (6.21) 
Operator, Z(0-4)=X(1-4) decaddsub Y(1-4)-C-N, 
Comment, when N is 1 or 0, it performs as an adder or subtracter, respectively 
Terminal, CB(4)=C, 
          CB(0-3)=(N⊙X(1-4))*Y(1-4)+(N⊙X(1-4))*CB(1-4)+Y(1-4)*CB(1-4), 
          W(0)=CB(0), 
          W(1-4)=X(1-4)⊕Y(1-4)⊕CB(1-4), 
Comment, when N is 1 or 0, C is the input carry or input borrow, respectively 
/begin/   IF (N*(W=0+1+...+9)+N'*W(0)') THEN (Z=W), 
          IF (N*(W=10+11+...+19)) THEN (Z=W add 6), 
          IF (N'*W(0)) THEN (Z=W sub 6), 
end of operator 


The terminal statement above defines the binary adder-subtracter. The three condi- 
tional micro-statements describe the correction. 
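
A bit-level Python sketch of the adder-subtracter follows. It is an illustration under stated assumptions: the names `carry_borrow_chain` and `decaddsub` (as a Python function) are hypothetical, and the term combining N and X is modeled as a coincidence (equivalence) so that it yields X when N is 1 and the complement of X when N is 0, which is the reading that makes the chain reduce to the adder and subtracter equations.

```python
def carry_borrow_chain(x, y, c, n):
    """Return W(0-4) as an integer using the shared carry/borrow equations."""
    cb = c                                    # CB(4) = C
    w = 0
    for k in range(4):                        # weight 1 first, weight 8 last
        xb, yb = (x >> k) & 1, (y >> k) & 1
        xs = 1 - (n ^ xb)                     # X when N = 1, X' when N = 0
        w |= (xb ^ yb ^ cb) << k              # W(i) = X(i) xor Y(i) xor CB(i)
        cb = (xs & yb) | (xs & cb) | (yb & cb)
    return (cb << 4) | w                      # W(0) = CB(0)


def decaddsub(x, y, c, n):
    """Add when n is 1, subtract when n is 0; return (Z(0), Z(1-4))."""
    w = carry_borrow_chain(x, y, c, n)
    if n == 1 and w > 9:
        w += 6                                # Z = W add 6
    elif n == 0 and w >> 4:
        w = (w - 6) & 0b11111                 # Z = W sub 6
    return w >> 4, w & 0b1111


print(decaddsub(9, 9, 1, 1), decaddsub(0, 1, 0, 0))
```

Only the carry/borrow logic depends on N; the sum and difference bits share the same exclusive-or network.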


6.6.4 A Decimal Serial-digit Arithmetic Unit 


A decimal arithmetic unit for decimal numbers in the serial-digit parallel-bit 
representation is shown in the block diagram of Fig. 6.15 where the above defined 
PDDAS together with the single-bit registers C and N are employed. This decimal 
arithmetic unit is similar to the binary, serial arithmetic unit shown in the block 
diagram of Fig. 6.2. 

As shown in Fig. 6.15, there are three array-registers A(1-4, 0-8), Q(1-4, 1-8), 
and R(1-4, 0-8) which are the decimal accumulator, the multiplier-quotient register, 
and the buffer register of the memory, respectively. The elements of array-register 
A are denoted by the subscripts in the manner shown in Fig. 6.16. Array-register A 
can store nine digits, though the chosen decimal number has only eight decimal 
digits; the extra digit storage is for use during multiplication and division. The extra 
digit storage in array-register R, though not needed, is provided for convenience. 
Single-bit registers SA, SQ, and SR store the signs of the decimal numbers in 
array-registers A, Q, and R, respectively. The functions of these three array-registers during 
addition, subtraction, multiplication, and division are the same as those shown in 
Table 6.1 except that, during division, the dividend is 16 digits long and is stored in 
array-registers A and Q. 


Fig. 6.15 Configuration of a decimal arithmetic unit using the 
serial-digit parallel-bit representation 


Fig. 6.16 Block diagram showing denotation of the elements of 
array-register A(1-4, 0-8); rows 1 through 4 carry the code weights 8, 4, 2, and 1 



In addition, there are eight single-bit registers: SUM, DIF, DSTEST, OV, AV, 
DV, C, and N. These registers serve the same purposes as those shown in the block 
diagram of Fig. 6.2 and in statement (6.7). There are three counters: DC, DVC, 
and WC. Digit counter DC is used during addition and subtraction, while digit-value 
counter DVC and word counter WC are employed during multiplication and division. 

The configuration of this decimal arithmetic unit is now described by the 
following CDL statements: 


Comment, configuration of a decimal arithmetic unit                               (6.22) 
Array-register,    A(1-4,0-8),     $array accumulator 
                   Q(1-4,1-8),     $multiplier-quotient array-register 
                   R(1-4,0-8),     $buffer array-register 
Register,          SA,             $store sign of the number in A 
                   SQ,             $store sign of the number in Q 
                   SR,             $store sign of the number in R 
Array-casregister, AQ(0-16)=A-Q, 
Register,          SUM,            $control register for add subsequence 
                   DIF,            $control register for subtract subsequence 
                   DSTEST,         $control register for divide-stop test subsequence 
                   OV,             $temporary register for carry or borrow 
                   AV,             $add overflow indicator 
                   DV,             $divide overflow indicator 
                   C,              $carry register 
                   N,              $add-sub control register (add when 1) 
Register,          DVC(1-4),       $digit value counter 
                   DC(1-4),        $digit counter 
                   WC(1-4),        $word counter 
Comment, terminals from the parallel decimal-digit adder-subtracter. 
Terminal, Z(0-4)=A(,8) decaddsub R(,8)-C-N, 


The above terminal statement describes the inputs and outputs of the PDDAS as 
well as the PDDAS itself. The previously described operator "decaddsub" in 
statement (6.21) should become a part of the description of this configuration. 


6.6.5 Decimal Addition and Subtraction 


Addition and subtraction in this decimal arithmetic unit use the direct addition 
and direct subtraction methods (4). Initially, the augend or the minuend is stored in 
array register A and the addend or subtrahend in array register R. After the addition 
or subtraction, the sum or the difference is stored in array register A and the augend 
or minuend is lost. (The addend in array register R may also be lost.) 

The decimal addition and subtraction sequence is shown in the sequence charts 
of Figs. 6.17 and 6.18. The sequence chart in Fig. 6.18 shows the SUM-DIF sub- 
sequence which is “called” during the addition and subtraction sequence when register 
SUM or DIF is set to 1. 

As shown in Fig. 6.18, the SUM-DIF subsequence is continuously examining 
registers SUM and DIF in the two waiting loops. When register SUM or DIF is found 
to contain a 1, the subsequence begins by setting register N to 1 or 0, respectively. 
The subsequence is then initialized by resetting the registers DC, C, and OV to 0. 
Now, the digit addition loop begins. Outputs Z(0) and Z(1-4) from the PDDAS are 
transferred to register C and subregister A(,8), respectively. Array registers A and R 
are then circularly shifted to the right one digit position and counter DC is incremented 
by 1. Counter DC is next tested for the value of 9. If it is not 9, the subsequence 
returns to the beginning of the digit addition loop and repeats the loop. During the 
execution of the loop, when counter DC reaches 8, the contents of register C are 
transferred to register OV, because if register C contains a 1, it indicates an overflow. 
When counter DC reaches 9, the addition or subtraction is completed. Register SUM 
or DIF is reset to 0 if register N is 1 or 0, respectively. The subsequence now returns 
to the waiting loops. 
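
The digit loop of the SUM-DIF subsequence can be simulated with a short Python sketch. It rests on several assumptions: `digit_addsub` is a hypothetical stand-in for the PDDAS, array registers are modeled as lists of nine decimal digits with index 8 holding the units digit, and the exact point at which C is copied to OV within the loop is an interpretation of the description.

```python
def digit_addsub(x, y, c, n):
    """Decimal-digit adder-subtracter: add when n = 1, subtract when n = 0."""
    t = x + y + c if n else x - y - c
    if t > 9:
        return 1, t - 10
    if t < 0:
        return 1, t + 10
    return 0, t


def sum_dif(a, r, n):
    """Run the digit loop on 9-digit lists a and r; return (new A, OV)."""
    a, r = list(a), list(r)
    c = ov = dc = 0                                   # DC, C, OV reset to 0
    while dc != 9:
        c, a[8] = digit_addsub(a[8], r[8], c, n)      # C <- Z(0), A(,8) <- Z(1-4)
        a = [a[8]] + a[:8]                            # A <- cir A (right one digit)
        r = [r[8]] + r[:8]                            # R <- cir R
        dc += 1                                       # DC <- countup DC
        if dc == 8:
            ov = c                                    # carry/borrow out of digit 8
    return a, ov


A = [0, 0, 0, 0, 0, 0, 7, 9, 5]
R = [0, 0, 0, 0, 0, 0, 6, 8, 3]
print(sum_dif(A, R, 1))
print(sum_dif(R, A, 0))
```

Nine circular shifts return every digit to its starting position, which is why the loop runs until DC reaches 9 even though the numbers have eight digits.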

The addition and subtraction sequence in Fig. 6.17 begins by setting register 
SUM or DIF to 1 to call the SUM-DIF subsequence, and then waits for its comple- 
tion. When the subsequence is completed, the addition and subtraction sequence 
continues on by testing register OV. In the case of addition, if register OV contains 
a 1, it indicates that an overflow has occurred and register AV is then set to 1; the 
sequence is terminated. In the case of subtraction, if register OV contains a 1, it indi- 
cates that a borrow has occurred and the difference is in the 10's complement repre- 
sentation. In this case, the difference is 10's complemented again to give the signed 
magnitude representation. This is accomplished by complementing the contents of 
register SA and by transferring the contents of array register A to array register R. 
Array register A is then reset to 0 and the SUM-DIF subsequence is called for a 
subtraction. When the subtraction is completed, the sequence is terminated. If register 
OV does not contain a 1, the sequence is terminated. 
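
The handling of OV after the subsequence returns can be sketched at the number level. This is a simplified sketch, not the sequence chart itself: `sum_dif`, `dadd`, and `dsub` are hypothetical Python names, registers are modeled as integers of nine decimal digits, and only the magnitude path described above is covered (how the operand signs select addition or subtraction is not modeled).

```python
MOD = 10 ** 9            # array register A holds nine decimal digits
LOW = 10 ** 8            # the numbers themselves have eight digits


def sum_dif(a, r, n):
    """Number-level stand-in for the SUM-DIF subsequence; returns (A, OV)."""
    if n:
        ov = int(a % LOW + r % LOW >= LOW)    # carry out of the eighth digit
        return (a + r) % MOD, ov
    ov = int(a % LOW < r % LOW)               # borrow out of the eighth digit
    return (a - r) % MOD, ov


def dadd(a, r):
    """Addition: OV = 1 means the add-overflow indicator AV is set."""
    a, ov = sum_dif(a, r, 1)
    return a, ov


def dsub(a, sa, r):
    """Subtraction A - R with recomplementing when a borrow occurs."""
    a, ov = sum_dif(a, r, 0)
    if ov:                        # difference is in the 10's complement
        sa ^= 1                   # complement the sign register SA
        r, a = a, 0               # R <- A, then A <- 0
        a, _ = sum_dif(a, r, 0)   # 0 - R gives the magnitude
    return a, sa


print(dsub(683, 0, 795), dsub(795, 0, 683), dadd(99999999, 1))
```

Subtracting the 10's-complement result from zero is what converts it back to a signed magnitude.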


6.6.6 Decimal Multiplication 


The multiplication in this decimal arithmetic unit is accomplished by repeated 
addition of the multiplicands (4). Initially, the multiplier is stored in array register 
Q and the multiplicand in array register R. After the multiplication, the product is 
stored in array registers Q and A and the multiplier is lost. 

A decimal multiplication sequence is shown in the sequence chart of Fig. 6.19. 
The addition required in the multiplication is performed by the SUM-DIF subse- 
quence. 

As shown in Fig. 6.19, there are two loops, an inner loop and an outer loop. 


Fig. 6.17 Sequence chart for the decimal addition and subtraction 
sequence 


Fig. 6.18 Sequence chart for the SUM-DIF subsequence 


Fig. 6.19 Sequence chart for the decimal multiplication 
sequence 
sequence 


After registers A and WC are reset to 0, the outer right-shifting loop begins by 
transferring the contents of subregister Q(,8) to counter DVC. The contents of counter 
DVC now indicate the number of additions required for the digit in subregister Q(,8). 
Then, the inner digit addition loop begins by testing the contents of counter DVC. 
If counter DVC is not 0, the contents of array registers A and R are added and counter 
DVC is decremented by 1; the sequence then returns to the beginning of the digit addition 
loop. The inner loop is now repeated until counter DVC reaches 0. A 0 in counter 
DVC indicates that no addition is required and the digit addition loop is terminated. 
But the outer loop continues on by shifting array casregister AQ one digit position to 
the right and incrementing counter WC by 1. Counter WC is then tested for the value 
of 8. If it is not 8, the sequence returns to the beginning of the right-shifting loop. 
The outer loop is repeated until counter WC reaches 8. At this time, the sign is 
determined and placed in registers SA and SQ. The multiplication is now completed. 
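
The two nested loops can be sketched in Python at the number level. This is a sketch under stated assumptions: `dmpy` is a hypothetical function name, registers are modeled as integers, the digit-value counter is stepped toward 0 by counting down, and the product sign is taken as the exclusive-or of the operand signs.

```python
def dmpy(q, r, sq=0, sr=0, digits=8):
    """Multiply magnitudes q (multiplier) and r (multiplicand).

    Returns (A, Q, sign); the product magnitude is A * 10**digits + Q.
    """
    a = 0                                     # A <- 0
    for _ in range(digits):                   # outer loop, counted by WC
        dvc = q % 10                          # DVC <- Q(,8)
        while dvc != 0:                       # inner digit-addition loop
            a += r                            # SUM-DIF subsequence adds R to A
            dvc -= 1
        # shift the cascaded register AQ right one digit position
        q = (a % 10) * 10 ** (digits - 1) + q // 10
        a //= 10
    return a, q, sq ^ sr


a, q, sign = dmpy(12, 34)
print(a, q, sign)
```

Because up to nine additions precede each shift, A can momentarily exceed eight digits, which is consistent with the extra digit position provided in array register A.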


6.6.7 Decimal Division Algorithm 


Decimal division makes use of the restoring algorithm (4). Initially, the divisor 
is stored in array register R and the 16-digit dividend in array casregister AQ. After 
the division, the quotient is in array register Q and the remainder in array register 
A; the dividend is lost. Division overflow is indicated by register DV. 

The decimal division sequence is shown in the sequence charts of Figs. 6.20 
and 6.21. Addition and subtraction required in the division are performed by the 
SUM-DIF subsequence, and division overflow test by the DSTEST subsequence. 
As shown in Fig. 6.20, the division begins by resetting register WC to 0 and by calling 
the DSTEST subsequence for testing division overflow. The DSTEST subsequence 
shown in Fig. 6.21 begins by calling the SUM-DIF subsequence which subtracts 
the divisor in array register R from the more significant part of the dividend in array 
register A and stores the difference in array register A. Then, register OV is tested for 
the borrow. If register OV contains a 0, the division overflow occurs and register DV 
is set to 1. Otherwise, the division overflow does not occur and the dividend is restored 
by calling the SUM-DIF subsequence to add the divisor in array register R to the 
difference in array register A. At this time, the execution of the DSTEST subsequence 
is completed and the subsequence is terminated by resetting register DSTEST to 0. 

After returning from the DSTEST subsequence, the division sequence tests 
register DV for 1. If it is 1, it indicates that the division overflow has occurred and 
the division is terminated. If it is 0, the division proceeds. Since the successful test of 
the division overflow means that the contents in array register A are smaller than the 
divisor in array register R, the dividend in array casregister AQ is multiplied by ten 
by shifting one digit position to the left. 

As shown in Fig. 6.20, there are two loops, an inner loop and an outer loop. 
Now, the outer left-shifting loop begins by resetting counter DVC to 0. Next, the inner 
digit-subtraction loop begins. The inner loop calls the SUM-DIF subsequence for 
a subtraction and tests whether a borrow has occurred during the subtraction. If the 
borrow has not occurred, counter DVC is incremented by 1 and the sequence returns 
to the beginning of the inner loop. The digit subtraction loop is repeated until the 
borrow occurs and the inner loop is terminated. The presence of the borrow indicates 
that the contents of counter DVC are the value of the quotient digit; this value is 
transferred to subregister Q(,8). Counter WC is next incremented by 1 and then 
tested for the value of 8. If it is not 8, array casregister AQ is shifted to the left one 
digit position and the sequence returns to the beginning of the outer loop. The outer 


Fig. 6.20 Sequence chart for the decimal division sequence 


Sec. 6.7 Decimal Multipliers and Dividers 247 


Fig. 6.21 Sequence chart for the divide-stop test subsequence 


loop is repeated until counter WC reaches 8. At this time, the quotient and remainder 
signs are determined and placed in registers SQ and SA, respectively. The division is 
now completed. 
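
The division sequence can be sketched in Python at the number level. Several points are assumptions made for the sketch: `ddiv` is a hypothetical name, registers are modeled as integers, signs are omitted, and the partial remainder is restored by adding the divisor back after the subtraction that produces a borrow, as the restoring algorithm implies.

```python
def ddiv(a, q, r, digits=8):
    """Divide the dividend held in (A, Q) by R.

    Returns (quotient, remainder, DV); DV = 1 signals division overflow.
    """
    if a >= r:                          # DSTEST: A - R produces no borrow
        return None, None, 1            # DV <- 1, division is terminated
    for _ in range(digits):             # outer loop, counted by WC
        # shift AQ left one digit (multiply the dividend by ten)
        a = a * 10 + q // 10 ** (digits - 1)
        q = (q * 10) % 10 ** digits
        dvc = 0                         # DVC <- 0
        while True:                     # inner digit-subtraction loop
            a -= r
            if a < 0:                   # borrow: one subtraction too many
                a += r                  # restore the partial remainder
                break
            dvc += 1                    # DVC <- countup DVC
        q += dvc                        # Q(,8) <- DVC
    return q, a, 0


print(ddiv(0, 408, 34))
print(ddiv(5, 0, 7))
```

A zero divisor is reported as an overflow by the first test, so the inner loop always terminates.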


6.7 Decimal Multipliers and Dividers 


Instead of using the repeated addition algorithm for decimal multiplication and 
the repeated subtraction algorithm for decimal division, there are other organizations 
for decimal multipliers and dividers. The organizations for three decimal multipliers 
and one decimal divider are now presented. These multipliers and divider are in the 
serial-digit mode of operation. In the following, the term addition or subtraction time 
denotes the time required for adding one decimal number to or subtracting one 
decimal number from another decimal number in the serial-digit mode of operation. 


6.7.1 A Multiplier Using the Nine Multiples of the Multiplicand 


The configuration of a serial-digit multiplier using the nine multiples of the 
multiplicand (3, 4) is shown in the block diagram in Fig. 6.22. The configuration and 
sequential operation of this multiplier are similar to the unit shown in Fig. 6.15. 
There are two major differences: one is the use of nine array registers for storing the 
nine multiples of multiplicand X instead of one array register R, and the other is 
the use of a multiplicand comparator-selector to replace counter DVC. The 
multiplicand comparator-selector detects the next multiplier digit to be multiplied and then 
selects the proper multiple of the multiplicand to be added to the partial product. 


Fig. 6.22 Configuration of a decimal multiplier using nine 
multiples of the multiplicand 


The nine multiples of the multiplicand may be obtained as follows. Initially, 
array register A is cleared and multiplicand X is stored in array register R1. An 
addition is performed; the sum which is X is left in A. This sum in A is transferred 
to register R2; at the same time, the sum in A and multiplicand X in R1 are added 
again to produce 2X in array register A. This process continues on until all nine 
multiples are generated and stored in array registers R1, R2,..., and R9. The 
multiplication time requires as many addition times as the number of digits in the 
decimal number, in addition to the nine addition times for generating the nine 
multiples. 

This multiplier has the disadvantage of requiring nine array registers for storing 
the nine multiples. The number of array registers, however, can be reduced by noting 
that the multiples 6X through 9X can be obtained by subtracting 4X through X, 
respectively, from 10X (for example, 7X = 10X - 3X); thus, only the first five multiples 
in addition to an adder-subtracter are needed. 
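
Both the table of multiples and the reduction to five stored multiples can be sketched in Python. The sketch is illustrative only: the function names are hypothetical, numbers are plain integers, and the comparator-selector is reduced to a table lookup on the multiplier digit.

```python
def multiples_table(x):
    """Generate X, 2X, ..., 9X by repeated addition (registers R1..R9)."""
    table, a = {}, 0
    for k in range(1, 10):
        a += x                       # one addition time per multiple
        table[k] = a
    return table


def multiply_with_multiples(x, y):
    """Multiply x by y using one table lookup per multiplier digit."""
    table = multiples_table(x)
    product, weight = 0, 1
    for digit in map(int, reversed(str(y))):    # least significant digit first
        if digit:
            product += table[digit] * weight    # selected multiple is added
        weight *= 10
    return product


def reduced_multiple(x, digit, table):
    """Use only X..5X: 6X..9X are formed as 10X minus 4X..1X."""
    return table[digit] if digit <= 5 else 10 * x - table[10 - digit]


print(multiply_with_multiples(795, 683))
```

In the reduced scheme, 10X costs no extra storage because it is the multiplicand shifted one digit position.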


6.7.2 A Multiplier Using the Doubling-and-halving Method 


Figure 6.23 shows the configuration of a serial-digit multiplier using the 
doubling-and-halving algorithm (3, 4). This algorithm requires the multiplicand to be doubled 
and the multiplier to be halved. As shown in Fig. 6.23, there are array registers R, 
Q, and A which store the multiplicand, multiplier, and product, respectively. A 
doubling logic network is inserted on the path for circulating the multiplicand in array 
register R, and a halving logic network on the path for circulating the multiplier in 
array register Q. 


Fig. 6.23 Configuration of a decimal multiplier using the 
doubling-and-halving method 

The multiplication begins by examining whether the multiplier is odd or even. 
If it is even, the multiplicand is doubled by circulating through the doubling logic network 
and the multiplier is halved by circulating through the halving logic network. If the 
multiplier is odd, the multiplicand is still doubled, but the multiplier is first decreased 
by 1 at its least significant digit position and is then halved. This completes one 
addition time. During the addition, the multiplicand while circulating is also added to the 
partial product in array register A if the multiplier is odd. This process of doubling 
the multiplicand, halving the multiplier, and generating the partial product during 
one addition time is repeated after each halving of the multiplier until the contents 
in array register Q become zero. At this time, the product is in array register A. If 
the 2n digits of the product are to be preserved, additional array registers are required 
and the number of registers thus required is larger than that in the configuration of 
Fig. 6.15. 
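
The doubling-and-halving procedure can be written as a few lines of Python. This is a number-level sketch with a hypothetical function name; the doubling, halving, and odd-number detecting networks are each reduced to a single arithmetic operation.

```python
def double_and_halve(multiplicand, multiplier):
    """Multiply by repeated doubling of one operand and halving of the other."""
    product = 0                           # array register A
    while multiplier != 0:                # repeat until Q becomes zero
        if multiplier % 2 == 1:           # odd-number detecting network
            product += multiplicand       # add the circulating multiplicand to A
            multiplier -= 1               # decrease the least significant digit by 1
        multiplicand *= 2                 # doubling network
        multiplier //= 2                  # halving network
    return product


print(double_and_halve(795, 683))
```

The number of passes grows with the logarithm (base 2) of the multiplier rather than with its digit values.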


6.7.3 A Multiplier Using a Built-in Multiplication Table 


The configuration of a serial-digit multiplier using a built-in multiplication table 
is shown in Fig. 6.24. The multiplication table is implemented by a logic network 
called single decimal-digit multiplier which multiplies two decimal digits and produces 
a product of two decimal digits. 

As shown in Fig. 6.24, there are array registers R, Q, and A in addition to carry 
registers C1 and C2. The multiplicand and the multiplier are stored in array registers 
R and Q, respectively. There are a 4-bit register C1 and a single-bit register C2 and 
two adders, the first decimal-digit adder and the second decimal-digit adder. 

The multiplication begins by multiplying the two least significant digits in array 
registers R and Q by the single decimal-digit multiplier. The two-digit number from 
the single decimal-digit multiplier is then added to register C1 which stores the 
previous digit carry. This digit carry is initially 0, and has a value of 0, 1,..., or 9. 
This addition is carried out in the first decimal-digit adder which produces two digits; 
the more significant digit is then stored in register C1 as the new digit carry and the 
less significant digit is added to the partial product in array register A and carry in 
register C2 by the second decimal-digit adder. The second decimal-digit adder 
produces a single-bit carry and a 
digit sum. The carry bit is then stored in register C2 and the sum digit is shifted into 
array register A as a part of the partial product. 

After the multiplication of the two least significant digits in array registers R and 
Q, array registers R and Q are shifted once so that the next least significant digits can 
be multiplied by the single decimal-digit multiplier. The single decimal-digit 
multiplier gives another two-digit number which is again added to the digit carry in register 
C1 by the first decimal-digit adder to produce another two-digit sum. This two-digit 
sum is then added to the partial product in array register A and the bit carry in register 
C2 by the second decimal-digit adder. The digit sum from the second decimal-digit 
adder is then shifted into array register A, while array registers R and Q are being 
shifted so that the next two least significant digits can be multiplied. This process 
continues on until all the digits of the decimal numbers in array registers R and Q 
are multiplied. At this time, the product is in array registers Q and A. 


Fig. 6.24 Configuration of a decimal multiplier with a built-in 
multiplication table 

The first decimal-digit adder in the above decimal multiplier is rather complex 
because of addition of three decimal digits. The second decimal-digit adder can make 
use of the previously described parallel decimal-digit adder if the decimal number is 
in the 8-4-2-1 code. The multiplication time requires as many addition times as the 
number of digits in the decimal number. 
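
The interplay of the multiplication table, the digit carry, and the bit carry can be sketched in Python. This sketch makes several assumptions: the names `TABLE` and `table_multiply` are hypothetical, digit lists are given least significant digit first and are non-empty, and the less significant digit of the first adder's output is passed to the second adder while the more significant digit is kept as the digit carry C1. The register shifting of the hardware is replaced by list indexing.

```python
# single decimal-digit multiplier: product of two digits, 0..81
TABLE = [[i * j for j in range(10)] for i in range(10)]


def table_multiply(r_digits, q_digits):
    """Multiply two digit lists (least significant digit first)."""
    a = [0] * (len(r_digits) + len(q_digits))     # partial product digits
    for j, q in enumerate(q_digits):              # one pass per multiplier digit
        c1 = c2 = 0                               # digit carry, bit carry
        for i, r in enumerate(r_digits):
            t = TABLE[r][q] + c1                  # first decimal-digit adder
            c1, low = divmod(t, 10)               # high digit becomes the new C1
            s = a[i + j] + low + c2               # second decimal-digit adder
            c2, a[i + j] = divmod(s, 10)
        a[len(r_digits) + j] += c1 + c2           # flush carries into the next digit
    return a


digits = table_multiply([5, 9, 7], [3, 8, 6])     # 795 x 683
print(digits[::-1])
```

The digit carry never exceeds 8 and the bit carry never exceeds 1, so the flushed value always fits in one decimal digit.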


6.7.4 A Divider Using the Nine Multiples of the Divisor 


The configuration of a serial-digit decimal divider using the nine multiples of the 
divisor is similar to that of the decimal multiplier using the nine multiples of the 
multiplicand in Fig. 6.22. There are nine array registers, R1,..., and R9, which 
store the nine multiples in addition to array registers Q and A. The multiplier is stored 
in array register Q. However, a decimal-digit adder-subtracter is required; the adder 
is used to generate the nine multiples and the subtracter to perform the division by 
repeated subtraction. Furthermore, it requires a comparator which can perform 
simultaneous comparisons of the nine multiples with partial remainder and then 
select the largest multiple which makes the next partial remainder positive. From the 
selected multiple, a quotient digit is generated and the selected multiple is subtracted 
from the current partial remainder to give the next partial remainder. After the 
division, the quotient is stored in array register A. 

The comparator in the decimal divider mentioned above is rather complex and 
impractical in current technology. The division time requires as many subtraction 
times as the number of digits in the decimal number besides the nine addition times 
for generating the nine multiples and the time for comparing and selection. 
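
The selection of a quotient digit from the stored multiples can be sketched in Python. The sketch is illustrative only: `divide_with_multiples` is a hypothetical name, the divisor is assumed to be greater than zero, the comparator is modeled by a simple search for the largest multiple not exceeding the partial remainder, and register placement of the operands is not modeled.

```python
def divide_with_multiples(dividend, divisor, digits=8):
    """Long division that selects each quotient digit from the table of multiples."""
    multiples = [k * divisor for k in range(10)]        # 0, X, 2X, ..., 9X
    remainder, quotient = 0, 0
    for ch in str(dividend).zfill(2 * digits):          # most significant digit first
        remainder = remainder * 10 + int(ch)
        # comparator: largest multiple that keeps the next remainder non-negative
        digit = max(k for k in range(10) if multiples[k] <= remainder)
        remainder -= multiples[digit]                   # one subtraction per quotient digit
        quotient = quotient * 10 + digit
    return quotient, remainder


print(divide_with_multiples(542985, 795))
```

Compared with repeated subtraction, each quotient digit costs one subtraction, at the price of the nine-way comparison.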


References 


1. BURKS, A. W., GOLDSTINE, H. H., and VON NEUMANN, J., "Preliminary Discussion of the 
Logical Design of an Electronic Computing Instrument," Inst. Advanced Study Rept. 
1, pt. 1, June 28, 1946. 


2. SHAW, R. F., "Arithmetic Operations in a Binary Computer," Rev. Sci. Instr., August, 
1950, pp. 687-793. 


3. Staff of Harvard Computation Laboratory, Synthesis of Electronic Computing and Control 
Circuits. Cambridge, Mass.: Harvard University Press, 1951. 


4. CHU, Y. Digital Computer Design Fundamentals. New York: McGraw-Hill Book 
Company, 1962. 


Problems 


6.1. A two-bit binary adder adds a two-bit augend, a two-bit addend, and a single-bit
carry at the same time (i.e., without using two single-bit full adders) to produce a two-bit
sum and a single-bit carry. Describe a two-bit adder by a terminal statement.

6.2. A two-bit subtracter subtracts a two-bit subtrahend and a single-bit borrow from a
two-bit minuend at the same time, and produces a two-bit difference and a single-bit
borrow. Describe a two-bit subtracter by a terminal statement.

6.3. A two-bit adder-subtracter functions as a two-bit adder when a single-bit register M
contains a 1, and as a two-bit subtracter when register M contains a 0. Describe the
two-bit adder-subtracter by a terminal statement.


6.4. Modify the sequence charts for the addition and subtraction sequence in Fig. 6.5 and
for the SUM-DIF subsequence in Fig. 6.6 if the two-bit adder-subtracter described in
Problem 6.3 is used instead of the single-bit adder-subtracter.

6.5. Modify the sequence chart for the multiplication sequence in Fig. 6.8 if the two-bit
adder-subtracter is used.

6.6. Modify the sequence charts for the binary division sequence in Fig. 6.10 and for the
divide-stop test subsequence in Fig. 6.11 if the two-bit adder-subtracter is used.

6.7. Modify the configurations of the binary, serial arithmetic unit in Figs. 6.2 and 6.12
and the sequence charts for the addition and subtraction sequence in Fig. 6.5, the
SUM-DIF subsequence in Fig. 6.6, the multiplication sequence in Fig. 6.8, the division
sequence in Fig. 6.10, and the divide-stop test subsequence in Fig. 6.11 if the binary
numbers are in the signed 1's complement representation.

6.8. Repeat Problem 6.7 if the binary numbers are in the signed magnitude representation.

6.9. Conceive the control part of the configuration for the decimal serial-digit arithmetic
unit in statements (6.22) and describe the following sequences by the CDL statements:
(a) the decimal addition and subtraction sequence in Fig. 6.17,
(b) the SUM-DIF subsequence in Fig. 6.18,
(c) the decimal multiplication sequence in Fig. 6.19,
(d) the decimal division sequence in Fig. 6.20,
(e) the divide-stop test subsequence in Fig. 6.21.

6.10. Define a parallel decimal-digit adder for the BCD digit using the excess-3 code by a
terminal statement.

6.11. Define a parallel decimal-digit subtracter for the BCD digits using the excess-3 code
by a terminal statement.

6.12. Define a parallel decimal-digit adder-subtracter for the BCD digit using the excess-3
code by a terminal statement. Register N is the add-subtract control register. When
register N contains a 1, the adder-subtracter functions as an adder; otherwise, it
functions as a subtracter.

6.13. Modify the sequence chart for the decimal addition and subtraction sequence in Fig.
6.17 and the sequence chart for the SUM-DIF subsequence in Fig. 6.18 if the parallel
decimal-digit adder-subtracter using the excess-3 code described in Problem 6.12 is
used instead of the decimal-digit adder-subtracter using the 8-4-2-1 code.

6.14. Modify the sequence chart for the decimal multiplication sequence in Fig. 6.19 if the
decimal-digit adder-subtracter using the excess-3 code is used.

6.15. Modify the sequence chart for the decimal division sequence in Fig. 6.20 and the
sequence chart for the divide-stop test subsequence in Fig. 6.21 if the decimal-digit
adder-subtracter using the excess-3 code is used.

6.16. Conceive a configuration including the control part for the decimal serial-digit
arithmetic unit and describe the following sequences by the statements:
(a) the decimal addition and subtraction sequence obtained from Problem 6.13,
(b) the SUM-DIF subsequence obtained from Problem 6.13,
(c) the decimal multiplication sequence obtained from Problem 6.14,
(d) the decimal division sequence obtained from Problem 6.15,
(e) the divide-stop test subsequence obtained from Problem 6.15.


6.17. Describe a serial decimal-digit adder (SDDA) using the 8-4-2-1 code by a terminal
statement.

6.18. Describe a serial decimal-digit subtracter (SDDS) using the 8-4-2-1 code by a terminal
statement.

6.19. Describe a serial decimal-digit adder-subtracter (SDDAS) using the 8-4-2-1 code by a
terminal statement. Register N is the add-subtract control register. When register N
contains a 1, the adder-subtracter functions as an adder. Otherwise, it functions as a
subtracter.

6.20. Modify the following sequence charts for a decimal serial arithmetic unit which
employs the SDDAS obtained from Problem 6.19:
(a) the sequence chart for the decimal addition and subtraction sequence in Fig. 6.17,
(b) the sequence chart for the SUM-DIF subsequence in Fig. 6.18,
(c) the sequence chart for the decimal multiplication sequence in Fig. 6.19,
(d) the sequence chart for the decimal division sequence in Fig. 6.20,
(e) the sequence chart for the divide-stop test subsequence in Fig. 6.21.


7 Memory Organization


Memory organization deals with the functional and operational aspects of digital
computer memories. This chapter is limited to descriptions of those memory
organizations associated with the CPU (the so-called main memory). Besides
the random-access memory and the associative memory, this chapter presents
memory addressing, memory loading, memory stack, memory buffer, and virtual
memory. Some details are shown in these descriptions to provide insight. As
these descriptions make evident, the organization of the main memory has become
increasingly important in making the hardware more effective.


7.1 Random Access Memory 


Memory here refers to random access memory. A modern stored-program computer
requires a random access memory where the program and data are stored and can
be randomly accessed by the processor in executing the program. This section describes
the functional organization of a random-access memory. Array organization, module
organization, and multiple access organization are described, together with a
discussion of the types of random access memory.


7.1.1 Array Organization 


Most of today's random access memories are magnetic core memories. For economic
reasons, they are usually of the coincident-current type. In a coincident-current
memory, one core stores one bit of information. The cores are assembled into
a plane, called the memory plane, and the planes are assembled into a stack, called
the memory array. For a single-array memory, there are usually as many cores in a
plane as the number of words in the memory, and as many planes in the
stack as the number of bits in the word. For example, a single-array magnetic-core
memory having 16,384 72-bit words may have 72 memory planes with each plane
having 128 by 128 cores. The memory plane of a coincident-current memory should
be built as large as possible, because larger planes make for a more economical
memory. A memory plane with 128 by 128 cores is one of the most practical
arrangements.

The block diagrams in Fig. 7.1 show functional organizations of the random
access memory. The memories consist of the memory array, the address register, the
buffer register (also called the data register), the terminals for initiating a read or a
write operation, and the memory control. Often there are also a data bus and an
address bus, as shown in Fig. 7.1. The memory array can be a single array as shown in
Fig. 7.1(a), or multiple arrays as shown in Fig. 7.1(b); in either case, there is only one
address register and one buffer register.

The memory reads or writes in a memory cycle. For a memory cycle performing
a read operation, the address is first transferred to the memory address register; the
addressed word is then read out of the memory array into the buffer register; and the
word is finally written back into the memory at the same location (i.e., restored)
because of the destructive nature of the reading operation. For a memory cycle
performing a write operation, the address transfer operation is the same; the reading
of the addressed word into the buffer register is inhibited; instead of writing back
the word just read out, the word now in the buffer register is written into the memory.

Fig. 7.1 Array organizations of the memory: (a) single array
organization; (b) multiple array organization
(Blocks shown: address bus, address register, memory control, memory array, buffer
register, data bus, initiate read terminal, initiate write terminal.)

Some memories operate on independent read and write cycles. They can perform 
either a read cycle or a write cycle. In a read cycle, the word is read out of the memory 
into the buffer register and the memory location is left with zeros. In a write cycle, 
the data in the buffer register is written into the memory location which was set to
zero by the previous read cycle. At the end of a read operation, the memory remains
in read condition until a write operation is started. At the end of a write operation, 
the memory is set to read so that a write cycle is always preceded by a read cycle. 
A memory which operates on independent read and write cycles is said to be capable 
of split-cycle operation. A memory capable of split-cycle operation can perform the 
read cycle, the read-wait cycle, the read-wait-write cycle, the read-write-wait cycle, 
and the read-wait-write-wait cycle in addition to the read-write cycle. Thus, the 
memory may become more effectively used. 
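
The split-cycle behavior described above can be modeled with a small Python sketch. The class SplitCycleMemory and its method names are hypothetical; the model ignores timing and only shows that a read cycle leaves zeros in the location, that a write cycle stores the buffer register, and that a read-wait-write cycle lets the processor change the word between the two halves.

```python
class SplitCycleMemory:
    """Toy model of a memory with independent read and write cycles."""

    def __init__(self, words):
        self.cells = [0] * words
        self.buffer = 0                       # buffer (data) register

    def read_cycle(self, address):
        """Destructive read: the word moves to the buffer, the cell is left zero."""
        self.buffer = self.cells[address]
        self.cells[address] = 0
        return self.buffer

    def write_cycle(self, address):
        """Write the buffer register into a cell cleared by the preceding read."""
        self.cells[address] = self.buffer

    def read_write_cycle(self, address):
        """Ordinary read-write cycle: read, then restore the same word."""
        value = self.read_cycle(address)
        self.write_cycle(address)
        return value


memory = SplitCycleMemory(8)
memory.read_cycle(5)            # clears the location (it held zero initially)
memory.buffer = 41
memory.write_cycle(5)           # location 5 now holds 41

# Read-wait-write: modify the word during the wait, saving one full memory cycle.
memory.read_cycle(5)
cleared = memory.cells[5]       # the location is zero until the write half
memory.buffer += 1
memory.write_cycle(5)
```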

The speed of memory operation is described by the memory cycle time and the
memory access time. The memory cycle time is the read-write cycle time, while the
memory access time is that part of the memory cycle time during which the read
operation is performed. The access time is often 0.3 to 0.7 of the memory cycle time.
For a memory capable of split-cycle operation, there are two cycle times: the read cycle
time and the write cycle time. The time for a write operation is the sum of the read and
write cycle times.

As previously mentioned, the number of memory planes in the array is the word 
length. The word length shows another characteristic of the memory: the data transfer 
width which is the number of bits transferred in parallel during one memory cycle 
time. The maximum transfer rate of the memory is the ratio of the data transfer 
width to the memory cycle time. For example, the memory with a cycle time of one 
microsecond and a word length of 64 bits has a maximum transfer rate of 64 million 
bits per second, or 8 million bytes per second (a byte consists of eight bits). 
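
The arithmetic of this example can be checked with a few lines of Python. The function name is hypothetical, and the cycle time is taken in nanoseconds only to keep the arithmetic exact.

```python
def max_transfer_rate_bits_per_sec(width_bits, cycle_time_ns):
    """Maximum transfer rate = data transfer width / memory cycle time."""
    return width_bits * 1_000_000_000 / cycle_time_ns


# The example from the text: 64-bit words, one-microsecond cycle time.
rate_bits = max_transfer_rate_bits_per_sec(64, 1000)
rate_bytes = rate_bits / 8      # a byte consists of eight bits
```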

In summary, the characteristics of a random-access memory are capacity, memory 
cycle time, memory access time, memory word length or data transfer width, cost per 
bit, and special ones such as split-cycle operation. 


7.1.2 Module Organization 


The capacity of a coincident-current magnetic-core memory can be increased 
by using multiple arrays. However, merely increasing the number of arrays does not 
increase the data transfer rate. A module organization can increase the memory 
capacity as well as the data transfer rate. 

In the multiple array organization in Fig. 7.1(b), if a pair of address and buffer 
registers are additionally provided to each array, then each group of the memory 
array, the address register, and the buffer register forms a module. Each module is 
usually self-contained. A memory organization with four modules is shown in Fig. 
7.2. The main memory unit of a digital computer may consist of one or more modules; 
the modular organization gives the flexibility of modular expansion of memory 
capacity. 

If the memory cycles of the four modules of the memory in Fig. 7.2 are staggered
by a quarter of the memory cycle time, then after the first module starts its memory
cycle, the second, third, and fourth modules start their memory cycles 1/4, 1/2, and 3/4
cycle time later. If memory words are accessed in such a sequential order, then a
maximum data transfer rate equal to four times the data transfer rate of one module
becomes possible. In actuality, the memory is not randomly accessed and the average
data transfer rate lies somewhere between 1 and 4 times that of one module. The above
mode of memory operation is called interleaving. The memory in Fig. 7.2 can permit a
2-way or 4-way interleaving. If instructions and operands are placed in two different
modules, the instruction-fetch memory cycle and the operand-fetch memory cycle can be
interleaved, thereby giving an effectively shorter memory cycle time; this is often called
overlapping. In a memory buffering scheme to be described subsequently, the main
memory has a 4-way interleaving, allowing access of a block of 4 memory words in
one memory cycle time and thus contributing to the effectiveness of memory buffering.

Fig. 7.2 Module organization of the memory
(Blocks shown: address bus, address registers, buffer registers, data bus.)
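
The effect of interleaving on the transfer rate can be illustrated by a rough timing model in Python. Several things here are assumptions rather than statements from the text: the function name cycles_needed, the assignment of word addresses to modules by the address modulo the number of modules, and the rule that at most one access can start in each quarter of a cycle. The model counts the time until the last access completes, so it includes the start-up of the first cycle.

```python
def cycles_needed(addresses, modules=4):
    """Memory cycle times needed to complete the accesses, in order."""
    ticks_per_cycle = modules            # one tick = 1/modules of a cycle
    free_at = [0] * modules              # tick at which each module is free again
    next_issue = 0                       # earliest tick for the next access to start
    finish = 0
    for address in addresses:
        module = address % modules       # assumed low-order interleaving
        start = max(next_issue, free_at[module])
        free_at[module] = start + ticks_per_cycle
        next_issue = start + 1           # starts are staggered by one tick
        finish = free_at[module]
    return finish / ticks_per_cycle


sequential = cycles_needed(range(16))        # words spread over all four modules
same_module = cycles_needed([0, 4, 8, 12])   # every word in module 0
```

In this model sixteen sequential words complete in 4.75 cycle times, approaching four words per cycle, whereas four words in the same module need four full cycle times, which is the rate of a single module.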


7.1.3 Crossbar Switch 


A crossbar switch is a switching device which allows simultaneous interconnections
between members of two groups of modules. Electro-mechanical crossbar
switches have long been used in automatic telephone switching centers. Because
of the speed requirements, electronic crossbar switches are required for computer
applications. A 4 by 4 crossbar switch is shown in Fig. 7.3, where any of the four
memory modules can be made to communicate with any of the four processors.
Four simultaneous communications can be made. Since each line of the crossbar
switch in Fig. 7.3 represents many wires, and the switching often needs to be extremely
fast, a 4 by 4 electronic crossbar switch is a major system unit. Although the crossbar
switch gives flexibility in making connections between memory modules and
processors, it is not flexible in allowing the number of memory modules or processors
to be chosen for a particular application where the number is different from the
number designed for the crossbar switch. An alternative approach is to distribute the
hardware of the crossbar switch to the four memory modules. In this way, each
memory module has four sets of terminals, sometimes called tails or ports, to allow
connections to as many as four processors. The distributed approach has one or more
memory modules, each of which may be connected to one or more processors; this
permits gradual system growth. It is possible that memory modules have different
numbers of tails and that certain memory modules are private to particular processors
but not other processors. The advantage of the distributed approach is that failure of one
switching component disables only one memory module but not all memory modules.

Fig. 7.3 A crossbar switch connecting four memory modules
(MU) and four processors (CPU)


7.1.4 Multiple-access Organization 


The memory organizations in Figs. 7.1 and 7.2 show a data bus by which the data
are transferred to and from the memory. At any one time, only one word of data
(i.e., the transfer width) can be transferred. Such a memory organization is referred to
as a single-access organization. If the processor and channels (channels are to be
described in later chapters) request memory access at the same time, the memory cycle
has to be allocated on a priority basis.

If a memory organization allows more than one access at the same time, it is a 
multiple-access organization; such an organization is illustrated in Fig. 7.4. It is 
basically a module organization. There are four data buses which allow memory 
access by the processor and three channels simultaneously. The data switch networks 
connect the data buses from the processor and the channels to the buffer registers 
of the four memory modules, while the address switch networks connect the address 
buses to the four address registers of these four modules. The simultaneous accesses 
require that there be no conflict in accessing the same module by more than one 
processor and channel. If such conflicts occur, priority networks are required as indi- 
cated in Fig. 7.4. With the four data buses, the data transfer rate can become as many 
as four times the transfer rate for the single access organization in Fig. 7.1. These 
data and address switch networks, as well as the priority network, are the equivalent 
of the distributed crossbar switch described above. 


7.1.5 Types of Random Access Memories 


Random access memory can be classified according to the functional character- 
istics or physical characteristics of the memory. According to the functional charac- 
teristics, there are file memory, buffer memory, stack memory, program memory, 
microprogram memory, channel program memory, channel control memory, and so 
forth. According to the physical characteristics of speed and capacity, there are large 
capacity memory, main memory, local memory, buffer memory, control memory, 
and register array; these form a hierarchy of memories of increasing speed and 
decreasing capacity. Most of the above memories are magnetic core memory. There 
are applications where the main memory or the control memory is a magnetic thin 
film memory and the buffer memory is a semiconductor memory. Most of the micro- 
program memories of today are read-only memories which are usually not magnetic 
core memories. 

As an illustration, Table 7.1 shows the characteristics of the IBM System/360
memories for models 30, 40, 50, 65, 75, 85, and 91. (These values are subject to change
by IBM.) These are magnetic-core memories. The characteristics listed in Table
7.1 are: memory cycle and access time, memory interleaving, data transfer width,
memory maximum and minimum capacities, and clock cycle time. In general, as the
model number increases, the clock cycle time, the memory cycle time, and the memory
access time decrease (except for the clock cycle time and the memory cycle time
for model 40 and the memory cycle and access times for model 85); the degree of
interleaving increases; the data transfer width increases (except for model 91); and both
the maximum and minimum capacities increase.

Among these models, model 85 is the most recent. Although it employs a memory
with a longer memory cycle time and a longer access time (but with a larger data
transfer width) than those of model 91, it makes use of a high-speed buffer which is
a very fast memory with a memory cycle time of 80 nanoseconds. Memory organization
which makes use of a high-speed buffer is to be described subsequently.

Fig. 7.4 A module organization with multiple accesses
(Blocks shown: processor; channels A, B, and C; address buses; four address switch
networks; arrays for modules #1 to #4; data buses; four data switch networks.)

TABLE 7.1 Characteristics of Some IBM System/360 Memories

                                  IBM SYSTEM/360 MODELS
CHARACTERISTICS                   30     40     50     65      75       85       91

Clock cycle time (µsec)           0.5    0.625  0.5    0.2     0.195    0.08     0.06
Memory cycle time (µsec)          2.0    2.5    2.0    0.75    0.75     1.04     0.75
Memory access time (µsec)         1.0    1.0    1.0    0.6     0.585    0.88     0.60
Memory interleaving               No     No     No     2-way   2-4 way  2-4 way  8-16 way
Data transfer width (bytes)       1      2      4      8       8        16       8
Memory max. capacity (bytes)      65K    256K   256K   1,024K  1,024K   4,096K   4,096K
Memory min. capacity (bytes)†     8K     16K    64K    128K    256K     512K     1,024K

†K represents a multiplier of 1,024


7.2 Memory Addressing 


Memory addressing refers to the scheme by which the information stored in the
memory is addressed. The unit of information that can be addressed is called the
addressable unit. An addressable unit can be a bit, byte, word, halfword, fullword,
double word, block, sector, or page. The ability to address one or more addressable
units of a memory has great significance in the use of the computer.

Some considerations in selecting the addressing scheme for a stored-program 
computer include: the use of binary address, the selection of the addressable unit, 
the ability to address a very large main memory, the bit efficiency of programs, the 
simplicity of loading and relocation, and the feasibility of dynamic allocation and 
relocation. 

The use of the binary address instead of other addresses such as decimal address
avoids the need for number translation, because binary address is most economical
in hardware to address today's random-access memories. The addressable unit is
commonly chosen to be a byte or a word. The selection of a byte as the addressable
unit allows easier addressing of a string of symbols, while the selection of a word as
the addressable unit is more convenient for addressing fixed-point and floating-point
numbers. The selection of a block as an addressable unit permits easier addressing
of data for the transfer between the main memory and the external memory (such
as a magnetic disk memory). The ability to address a large-capacity memory has
become an important characteristic because the use of a large-capacity memory
reduces the number of references to the external memory. Addressing to the bit
requires more bits in a memory address; the low frequency of its usage makes it hard
to justify. Bit efficiency refers to the effective use of the bits stored in the memory.
Instructions with long addresses occupy a significant part of the memory and thus
make the bit efficiency low. Loading programs into the computer is becoming an
increasingly complex task. During loading, address modification of the instructions
in the program is required. Dynamic allocation and relocation of programs in the
memory occur often in a time-shared, multiprogramming application. Addressing
schemes that require no address modification during loading and can dynamically
allocate and relocate the programs have become available; these addressing schemes
require additional hardware.

In this section, various addressing schemes are described by using the procedural
version of the CDL. These addressing schemes are: immediate addressing, direct
addressing, indirect addressing, indexed addressing, relative addressing, base addressing,
implicit addressing, and register addressing.


7.2.1 Immediate, Direct, and Indirect Addressing 


In the immediate addressing, the address field of an instruction contains an 
operand itself, not an address. This operand is usually an integer constant, not a 
floating-point number. Let M be the memory with address register AR and buffer 
register D in addition to program counter PC and accumulator AC. With this con- 
figuration, the immediate addressing is illustrated as follows: 


Comment, configuration for immediate, direct, and indirect addressing         (7.1)

Register,       AR(0-14),                $address register
                D(0-35),                 $buffer register
                PC(0-14),                $program counter
                I,                       $indirect addressing flag
                AC(0-35),                $accumulator
                OP(0-11),                $op-code register
Subregister,    D(ADDR)=D(21-35),        $address field of the instruction
Memory,         M(AR)=M(0-32767,0-35),

Comment, fetch the instruction                                                 (7.2)
    AR←PC;                  $transfer next instruction address to AR
    D←M(AR);                $read the instruction out of memory into D
Comment, immediate addressing
    AC←0;                   $clear AC
    AC(21-35)←D(ADDR);      $transfer the operand into AC

In the above, the accumulator is first cleared and the operand then transferred to the
AC because the operand has only 15 bits.
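
The bit-field handling in (7.2) can be mirrored in Python. The sketch below assumes that bit 0 is the leftmost (most significant) bit of the 36-bit word, so that D(21-35) is the rightmost 15 bits; the helper names field and immediate_operand are hypothetical.

```python
WORD_BITS = 36


def field(word, first, last):
    """Bits first..last of a 36-bit word, with bit 0 taken as the leftmost bit."""
    width = last - first + 1
    shift = WORD_BITS - 1 - last
    return (word >> shift) & ((1 << width) - 1)


def immediate_operand(instruction):
    """AC <- 0, then AC(21-35) <- D(ADDR): the operand comes from the instruction."""
    ac = 0
    ac |= field(instruction, 21, 35)
    return ac


# An instruction word with an arbitrary op code in bits 0-11 and operand 1234.
instruction = (0b101010101010 << 24) | 1234
operand = immediate_operand(instruction)
```

No memory reference is made after the instruction fetch, which is the defining property of immediate addressing.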
In direct addressing, the address field of an instruction contains an operand
address at which the operand is fetched from the main memory. With the configuration
in (7.1), direct addressing is illustrated as follows:

Comment, fetch the instruction                                                 (7.3)
    AR←PC;                  $transfer next inst. address to AR
    D←M(AR);                $read the inst. out of memory into D
Comment, direct addressing
    AR←D(ADDR);             $transfer operand address to AR
    D←M(AR);                $read operand out of the memory
    AC←D;                   $transfer the operand to AC
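
The statements in (7.3) translate almost line for line into Python. In this sketch the memory is a list of integers, the address field is assumed to occupy the rightmost 15 bits of the word, and the function name execute_direct is hypothetical.

```python
ADDR_MASK = (1 << 15) - 1        # D(ADDR) = D(21-35), assumed to be the low 15 bits


def execute_direct(memory, pc):
    """Fetch the instruction at pc, then fetch its operand by direct addressing."""
    ar = pc                      # AR <- PC
    d = memory[ar]               # D <- M(AR): instruction fetch
    ar = d & ADDR_MASK           # AR <- D(ADDR)
    d = memory[ar]               # D <- M(AR): the single operand fetch
    return d                     # AC <- D


memory = [0] * 16
memory[0] = (0o0400 << 24) | 9   # arbitrary op code, operand address 9
memory[9] = 777
ac = execute_direct(memory, 0)
```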


In the indirect addressing, the operand is first fetched from the memory as in 
the case of direct addressing. However, the address field of this operand contains 
another operand address at which the second operand is fetched. This process can 
continue as many times as required. To be specific, let register OP store the OP code 
and register I store the bit which, when 1, indicates the indirect addressing. With 
these additional elements, the indirect addressing is illustrated as follows: 


Comment, fetch the instruction                                                 (7.4)
    AR←PC;                  $transfer next instruction address to AR
    D←M(AR);                $read the instruction out of memory into D
Comment, indirect addressing
    OP←D(0-11);             $store op-code in OP
    I←D(12);                $store indirect addressing flag in I
    AR←D(ADDR);             $transfer address to AR
    D←M(AR);                $read operand out of the memory
/X/ IF (I=1) THEN (AR←D(ADDR)) ELSE (GOTO Y);
    IF (I=1) THEN (D←M(AR));
    IF (I=1) THEN (I←D(12), GOTO X);
/Y/ AC←D;

In the above, bit D(12), when 1, indicates indirect addressing. After an operand is
fetched, another operand fetch is required if bit D(12) of the operand contains a 1.
If bit D(12) is 0, the operand is the operand required.
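
The loop formed by labels X and Y in (7.4) can be written as an ordinary while loop. The sketch follows the order of the statements, so the flag I is reloaded from the word just fetched. The bit positions assume that bit 0 is the leftmost of 36 bits; the function name and the max_fetches guard against an endless chain are additions for illustration and are not part of (7.4).

```python
ADDR_MASK = (1 << 15) - 1            # D(ADDR) = D(21-35)
INDIRECT_BIT = 1 << (35 - 12)        # D(12)


def execute_indirect(memory, pc, max_fetches=16):
    """Return (operand, number of operand fetches) for the instruction at pc."""
    d = memory[pc]                           # instruction fetch
    indirect = bool(d & INDIRECT_BIT)        # I <- D(12)
    d = memory[d & ADDR_MASK]                # first operand fetch
    fetches = 1
    while indirect:                          # label X
        if fetches >= max_fetches:
            raise RuntimeError("indirect chain too long")
        d = memory[d & ADDR_MASK]            # AR <- D(ADDR); D <- M(AR)
        indirect = bool(d & INDIRECT_BIT)    # I <- D(12) of the word just fetched
        fetches += 1
    return d, fetches                        # label Y: AC <- D


memory = [0] * 16
memory[0] = INDIRECT_BIT | 5         # indirect instruction, address field 5
memory[1] = 7                        # direct instruction, address field 7
memory[5] = 6                        # first word fetched; its address field is 6
memory[6] = INDIRECT_BIT | 7         # flag set: one more fetch is required
memory[7] = 42                       # final operand, bit D(12) is 0
chained = execute_indirect(memory, 0)
direct = execute_indirect(memory, 1)
```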

If only two operand fetches are allowed, this addressing is referred to as
single-level indirect addressing; if more than two are allowed, it is referred to as
multiple-level indirect addressing.

In summary, the immediate addressing requires no operand fetch from the 
memory. The direct addressing requires one operand fetch. The indirect addressing 
requires two or more operand fetches. 




7.2.2 Indexed Addressing 


Indexed addressing makes use of an index register where an increment (or a 
decrement) is stored. In the indexed-addressing, the effective address is computed 
and then transferred to the address register for fetching the operand. The effective 
address is the sum of the contents of the chosen index register and the contents of the 
address field of the instruction. Let array-register XR be the three index registers. 
With these additional registers, the indexed addressing is illustrated below. 

Array-register, XR(1-3,0-14),                                                  (7.5)
Comment, fetch the instruction
    AR←PC;                  $transfer next instruction address to AR
    D←M(AR);                $read the instruction out of memory into D
Comment, indexed addressing
    IF (D(18-20)=0) THEN (AR←D(ADDR)),
    IF (D(18-20)=1) THEN (AR←D(ADDR) add XR(1,)),
    IF (D(18-20)=2) THEN (AR←D(ADDR) add XR(2,)),
    IF (D(18-20)=4) THEN (AR←D(ADDR) add XR(3,));
    D←M(AR);
    AC←D;

The bits D(18-20) contain the index field which specifies the index register for
indexing. If the index field contains 0, subregister D(ADDR) is transferred to register AR.
If bits D(18-20) contain 1, 2, or 4, then the contents of index registers XR(1,), XR(2,),
or XR(3,) are added to the contents of subregister D(ADDR) to give the effective
address. The use of indexed addressing eliminates the address-modification
instructions in a loop and makes the repetition of instructions on different sets of similar
data extremely convenient.
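
The computation of the effective address in (7.5) is sketched below in Python. The index field codes 1, 2, and 4 follow the text; the names index_field and effective_address are hypothetical, and truncating the sum to 15 bits is an assumption based on the width of AR.

```python
ADDR_MASK = (1 << 15) - 1                 # 15-bit addresses, as in AR(0-14)


def index_field(word):
    """D(18-20), with bit 0 taken as the leftmost bit of the 36-bit word."""
    return (word >> (35 - 20)) & 0b111


def effective_address(instruction, xr):
    """xr maps an index-register number (1 to 3) to its contents."""
    increments = {0: 0, 1: xr[1], 2: xr[2], 4: xr[3]}   # codes used in (7.5)
    code = index_field(instruction)
    if code not in increments:
        raise ValueError("index field code not defined in (7.5)")
    return ((instruction & ADDR_MASK) + increments[code]) & ADDR_MASK


xr = {1: 1, 2: 20, 3: 300}
unindexed = effective_address(100, xr)
with_xr2 = effective_address((2 << 15) | 100, xr)
with_xr3 = effective_address((4 << 15) | 100, xr)
```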

Indexed addressing can be implemented with incrementing (or decrementing).
In this case, after the effective address is formed and transferred to register AR, the
contents of the index register are incremented by 1. This facilitates the use of the index
register for loop control. It is illustrated by the following statements.

Comment, fetch the instruction                                                 (7.6)
    AR←PC;                  $transfer next instruction address to AR
    D←M(AR);                $read the instruction out of memory into D
Comment, indexed addressing with incrementing
    IF (D(18-20)=0) THEN (AR←D(ADDR)),
    IF (D(18-20)=1) THEN (AR←D(ADDR) add XR(1,),
                          XR(1,)←countup XR(1,)),
    IF (D(18-20)=2) THEN (AR←D(ADDR) add XR(2,),
                          XR(2,)←countup XR(2,)),
    IF (D(18-20)=4) THEN (AR←D(ADDR) add XR(3,),
                          XR(3,)←countup XR(3,));
    D←M(AR);
    AC←D;
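
The use of the incremented index register for loop control can be shown with a short Python sketch. It repeats the same indexed access to step through a table; the function name sum_table is hypothetical, and the accumulation stands in for an add instruction that (7.6) itself does not describe.

```python
def sum_table(memory, base, count):
    """Repeat one indexed instruction 'count' times, stepping through a table."""
    xr1 = 0                       # index register XR(1,), cleared before the loop
    total = 0
    for _ in range(count):
        ar = base + xr1           # AR <- D(ADDR) add XR(1,)
        xr1 += 1                  # XR(1,) <- countup XR(1,)
        total += memory[ar]       # D <- M(AR); assumed add to the accumulator
    return total, xr1


memory = [0] * 8 + [3, 5, 7, 11]
total, final_index = sum_table(memory, base=8, count=4)
```

The address field of the instruction (here base) never changes; only the index register does, which is why no address-modification instruction is needed in the loop.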


Indexed addressing can also be implemented in conjunction with indirect address- 
ing. In this case, it provides for automatic stepping of a pointer through sequential 
elements of a table of operands. 

Another extension of indexed addressing is multiple indexing which permits more 
than one increment to be added to form an effective address. Multiple indexing can 
be accomplished by making the index field maintain the address of a memory word 
which contains more than one address. These addresses may then be the addresses 
of memory words which are used as index registers. 


7.2.3 Relative Addressing 


Relative addressing specifies the operand address of an instruction which is 
relative to the address in the program counter where the current instruction address 
is stored. With the above configuration, the relative addressing is illustrated below. 


Comment, fetch the instruction                                                 (7.7)
    AR←PC;                  $transfer next instruction address to AR
    D←M(AR);                $read the instruction out of memory into D
Comment, relative addressing
    IF (D(13)=1) THEN (AR←D(ADDR) add PC) ELSE (AR←D(ADDR));
    D←M(AR);
    AC←D;

In the above, bit D(13), when 1, indicates relative addressing. The address is the sum
of the contents of the program counter and the contents of the address field of the
instruction.
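
A brief Python sketch of the address computation in (7.7) follows. The function name relative_target is hypothetical, and truncating the sum to 15 bits is an assumption based on the width of AR.

```python
ADDR_MASK = (1 << 15) - 1


def relative_target(pc, address_field, relative):
    """Operand address for (7.7); 'relative' plays the part of bit D(13)."""
    if relative:
        return (address_field + pc) & ADDR_MASK    # AR <- D(ADDR) add PC
    return address_field                           # AR <- D(ADDR)


# The same instruction placed at two different locations.
at_100 = relative_target(100, 5, True)
at_2100 = relative_target(2100, 5, True)
```

The operand stays five locations beyond the instruction wherever the instruction is placed, so the address field needs no modification when the program is moved.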


7.2.4 Base Addressing 


Base addressing specifies the operand address of an instruction with reference
to the address in the base address register BAR. Thus, it is relative addressing with
respect to the address in the base address register instead of the address in the
program counter. With the above configuration, the base addressing is illustrated
below.

Register, BAR(0-14),        $base address register                             (7.8)
Comment, fetch the instruction
    AR←PC;                  $transfer next instruction address to AR
    D←M(AR);                $read the instruction out of memory into D
Comment, base addressing
    IF (D(14)=1) THEN (AR←D(ADDR) add BAR) ELSE (AR←D(ADDR));
    D←M(AR);

In the above, bit D(14), when 1, indicates base addressing. The address is the sum of
the contents of the base address register and the contents of the address field of the
instruction.
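
Base addressing is what allows a program to be loaded anywhere without address modification. The Python sketch below loads the same three-word program at two origins and changes only the base address register. The names fetch_operand and load are hypothetical, and the position of bit D(14) assumes that bit 0 is the leftmost of 36 bits.

```python
ADDR_MASK = (1 << 15) - 1
BASE_BIT = 1 << (35 - 14)            # D(14)


def fetch_operand(memory, pc, bar):
    """Fetch the instruction at pc and then its operand, as in (7.8)."""
    d = memory[pc]                                   # D <- M(AR), instruction
    if d & BASE_BIT:
        ar = ((d & ADDR_MASK) + bar) & ADDR_MASK     # AR <- D(ADDR) add BAR
    else:
        ar = d & ADDR_MASK                           # AR <- D(ADDR)
    return memory[ar]                                # D <- M(AR), operand


def load(memory, program, origin):
    """Copy the program into memory unchanged, starting at 'origin'."""
    memory[origin:origin + len(program)] = program


program = [BASE_BIT | 2, 0, 99]      # word 0 refers to word 2 of the same program
memory = [0] * 64
load(memory, program, 8)
load(memory, program, 40)
first_copy = fetch_operand(memory, pc=8, bar=8)
second_copy = fetch_operand(memory, pc=40, bar=40)
```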


7.2.5 Register Addressing 


Operands are normally stored in the memory. It is preferable to store some operands
(and some intermediate results) in available registers in order to speed up the
processing. Register addressing addresses the operand in such a register.
Let array-register GR be the two general-purpose registers. The register addressing
is now illustrated below.

Array-register, GR(1-2,0-35),    $general-purpose registers                    (7.9)
Register, RA(0-1),               $storing bits D(15-16)
Comment, fetch the instruction
    AR←PC;                  $transfer next instruction address to AR
    D←M(AR);                $read the instruction out of memory into D
Comment, register addressing
    AR←D(ADDR);             $transfer second operand address to AR
    RA←D(15-16);            $control bits
    D←M(AR);                $fetch the second operand
    IF (RA=2) THEN (GR(1,)←GR(1,) add D),
    IF (RA=3) THEN (GR(2,)←GR(2,) add D);

There are two operands: one in register GR(1,) or GR(2,) and the other located
by the memory address in subregister D(ADDR). In order to illustrate the register
addressing, it is assumed that these two operands are added. Bit D(15), when 1,
indicates register addressing. Bit D(16), when 0, indicates register GR(1,); otherwise,
register GR(2,).

If there is only one general-purpose register and if this register is the accumulator, 
then there is no need for explicitly addressing the accumulator. Such an addressing 
is known as implicit addressing. 

If the registers above contain operand addresses instead of operands, this may be 
referred to as indirect register-addressing. Furthermore, it is possible to additionally 
provide incrementing in conjunction with indirect register-addressing. With the above 
configuration, the indirect register-addressing with incrementing is illustrated below. 


Comment, fetch the instruction (7.10)
AR<-PC; $transfer next instruction address to AR
D<-M(AR); $read the instruction out of memory into D
Comment, indirect register-addressing with incrementing
IF (D(16-17)=1) THEN (AR<-GR(1,21-35),
GR(1,21-35)<-countup GR(1,21-35)),
IF (D(16-17)=3) THEN (AR<-GR(2,21-35),
GR(2,21-35)<-countup GR(2,21-35));
D<-M(AR);
AC<-D;

In the above, bit D(17), when 1, indicates the indirect register-addressing with
incrementing. The contents of register GR(1,21-35) or GR(2,21-35) are transferred to
address register AR, and register GR(1,21-35) or GR(2,21-35) is incremented by one
according to whether bit D(16) is 0 or 1, respectively.
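Indirect register-addressing with incrementing behaves like a post-increment pointer: AR receives the old address while the address field of the register is counted up. The following Python sketch shows this under stated assumptions: the dictionary-based register file and memory, the function name, and the wraparound of the 15-bit address field are all illustrative choices.

```python
WORD_MASK = (1 << 36) - 1   # general-purpose registers are 36 bits wide
ADDR_MASK = (1 << 15) - 1   # bits 21-35 are the rightmost 15 bits


def indirect_with_increment(gr, memory, which):
    """Fetch the operand addressed by GR(which,21-35), then count it up."""
    ar = gr[which] & ADDR_MASK                    # AR <- GR(which,21-35)
    upper = gr[which] & WORD_MASK & ~ADDR_MASK    # bits 0-20 stay unchanged
    # Assumption: countup wraps around within the 15-bit field.
    gr[which] = upper | ((ar + 1) & ADDR_MASK)
    return memory[ar]                             # D <- M(AR); AC <- D
```

Calling the function repeatedly with the same register steps through consecutive memory words, which is the usual reason for providing this addressing mode.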


7.3 Memory Stack 


A stack is a linear list of data elements in which insertions, deletions, and often 
accesses of the data elements are made at one end. It is also called a push down list 
or a last-in first-out (LIFO) list. The stack is a powerful means for handling a nested 
structure of data. 

There are other linear lists such as queues and deques. There are other data
structures such as trees and multi-linked structures. These data structures can all be
implemented by hardware; the following implementation of a stack serves as
an example.


7.3.1 Stack Organization 


The stack implemented by hardware is usually an assigned area of memory.
A stack organization is shown in Fig. 7.5 where the top element, the second element,
..., and the bottom element are shown. As indicated, the top element is pointed to by
the stack pointer, SP.

Fig. 7.5 Stack organization (elements are inserted and deleted at the top element,
which is followed by the second element, the third element, ..., and the bottom element)

Let the assigned area in the memory be 1,024 words, AR be the
address register, B be the buffer register, and SP be the register for storing the stack
pointer. The configuration of the stack organization is described as follows.


Comment, description of a stack organization (7.11)
Register, AR(0-14), $address register
          SP(S,0-9), $stack top element pointer
          SM(0-14), $stack bottom element pointer
          B(0-35), $buffer register
          OVERFLOW, $stack overflow indicator
          UNDERFLOW, $stack underflow indicator
Memory, M(AR)=M(0-32767,0-35)


In the above, bit SP(S) is used to indicate stack underflow or stack overflow, as will 
be shown. 


7.3.2 Stack Operation 


The three important stack operations are the insertion of an element, the deletion
of an element, and the access of an element. For an element to be inserted onto the
stack, the top element is pushed down to become the second element and the new
element is placed at the top of the stack to become the top element. For an element
to be deleted from the stack, the top element is removed from the stack and the original
second element now becomes the top element. For the top element to be accessed,
its contents, while remaining unchanged, are copied to another register. These stack
operations are now described by the following statements:


Comment, description of stack operations

Comment, push-down operation (7.12)
SP<-countup SP;
IF (SP(S)=1) THEN (OVERFLOW<-1, GOTO X);
AR<-SM add 0-SP(0-9);
M(AR)<-B;

Comment, pop-up operation (7.13)
AR<-SM add 0-SP(0-9);
B<-M(AR);
SP<-countdn SP;
IF (SP(S)=1) THEN (UNDERFLOW<-1, GOTO X);

Comment, access of the top element (7.14)
AR<-SP;
B<-M(AR);

/X/ END


In the above push-down operation, when the contents of stack pointer SP exceed
1,023 after a countup micro-operation, bit SP(S) becomes 1 and the stack has
overflowed. Similarly, in the pop-up operation, when the contents of stack pointer SP
become negative after a countdown micro-operation, bit SP(S) again becomes 1
and the stack has underflowed.
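The way bit SP(S) doubles as an overflow and underflow detector can be illustrated with a short Python sketch of the push-down and pop-up operations. The class name, the sparse dictionary used for memory, and in particular the assumption that SP = 0 denotes an empty stack are illustrative; the text does not state the initial value of SP.

```python
class MemoryStack:
    """Sketch of the push-down (7.12) and pop-up (7.13) operations."""

    SP_MASK = 0x7FF   # SP(S,0-9): bit S plus ten count bits
    S_BIT = 0x400     # position of SP(S)

    def __init__(self, sm=0):
        self.memory = {}          # sparse model of memory M
        self.sm = sm              # SM: base address of the stack area
        self.sp = 0               # assumption: SP = 0 means an empty stack
        self.overflow = False
        self.underflow = False

    def _address(self):
        # AR <- SM add SP(0-9), with the count zero-extended
        return self.sm + (self.sp & 0x3FF)

    def push(self, b):
        self.sp = (self.sp + 1) & self.SP_MASK    # SP <- countup SP
        if self.sp & self.S_BIT:                  # count passed 1,023
            self.overflow = True
            return
        self.memory[self._address()] = b          # M(AR) <- B

    def pop(self):
        b = self.memory.get(self._address())      # B <- M(AR)
        self.sp = (self.sp - 1) & self.SP_MASK    # SP <- countdn SP
        if self.sp & self.S_BIT:                  # count went negative
            self.underflow = True
        return b
```

Under the stated assumption, two pushes followed by two pops return the elements in last-in first-out order without raising either indicator, and one further pop sets UNDERFLOW.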


7.3.3 Stack Adjustment 


During the execution of certain instructions, it is desirable that the accumulator 
becomes the top element of the stack and the buffer register of the memory, the 
second element. To accomplish this configuration, before the execution of these 
instructions, the two top elements of the stack should be popped up from the memory 
into registers A and B. After the execution of these instructions, the contents of the 
two top elements in registers А and B are pushed down into the stack in the memory. 
Such operations, called stack adjustments, can be implemented by hardware and 
operated automatically. An example is the stack in the Burroughs B5500 computer (1).

As an illustration of the hardware implementation, the push-down stack adjustment
is now described. Let R be the register where the upper-limit plus-one of the
stack in the memory is stored. Let Y(1-2) be the register for stack control. When
Y(1) is 1, it indicates that register A is not empty; when Y(2) is 1, it indicates that
register B is not empty. Let SP be the stack pointer, A be the accumulator, AR be
the address register, B be the buffer register, OVERFLOW be the stack overflow
indicator, and M be the memory. This configuration and the push-down operation
are described below.


Comment, description of push-down stack adjustment operation (7.15)
Register, AR(0-14), $address register
          B(0-47), $buffer register
          A(0-47), $accumulator
          SP(0-14), $stack pointer
          R(0-14), $stack limit pointer
          Y(1-2), $stack control bits
          OVERFLOW, $stack overflow indicator
Memory, M(AR)=M(0-32767,0-47),

/W/ IF (Y=0) THEN (GOTO W);
IF (Y(2)=0) THEN (GOTO Z) ELSE (SP<-countup SP);
IF (SP=R) THEN (OVERFLOW<-1, GOTO W);
AR<-SP;
M(AR)<-B, Y(2)<-0;

/Z/ IF (Y(1)=0) THEN (GOTO W) ELSE (SP<-countup SP);
IF (SP=R) THEN (OVERFLOW<-1, GOTO W);
B<-A, AR<-SP;
M(AR)<-B, Y(1)<-0, GOTO W;


As described above, if register Y is 0, it indicates that the contents of registers 
A and B have been pushed down and the operation is completed. Otherwise, if bit 
Y(2) is 1, the contents of register B need to be pushed down into the stack in the 
memory. In this case, stack pointer SP is incremented by 1 and then tested to see 
whether it exceeds the upper limit which is stored in register R. If the upper limit is 
exceeded, register OVERFLOW is set to 1. If the upper limit is not exceeded, the con- 
tents in register B are pushed down to the stack and bit Y(2) is reset to 0. At this time, 
Y(1) is tested for 0. If it is 0, the operation is completed; otherwise, it indicates that 
the contents of register A are to be pushed down. In this case, stack pointer SP is
again incremented by 1 and again tested to see whether it exceeds the upper limit.
If the upper limit is not exceeded, the contents in register A are first transferred to
register B and then pushed down to the stack. Bit Y(1) is reset to 0. The operation
is now completed.
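The push-down stack adjustment can be followed more easily in a short Python sketch. The dictionary keys, the function name, and the use of a dictionary for memory M are illustrative assumptions; the order of the two steps (register B first, then register A) follows the description above.

```python
def push_down_adjust(state):
    """Flush register B and then register A into the memory stack.

    `state` holds A, B, SP, R, OVERFLOW, the control bits Y1 and Y2,
    and a dictionary M standing in for the memory.
    """
    for register, flag in (("B", "Y2"), ("A", "Y1")):
        if not state[flag]:
            continue                          # register is empty; skip it
        state["SP"] += 1                      # SP <- countup SP
        if state["SP"] == state["R"]:         # reached upper-limit plus-one
            state["OVERFLOW"] = True
            return state
        if register == "A":
            state["B"] = state["A"]           # B <- A
        state["M"][state["SP"]] = state["B"]  # AR <- SP; M(AR) <- B
        state[flag] = False                   # reset the control bit
    return state
```

Because register B is flushed before register A, the former contents of the accumulator end up as the top element of the stack in memory.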




7.4 Associative Memory 


The random address memory has the attributes that the memory is solely a 
storage device and each memory word is located by an address. It may be called a 
location addressable memory. An associative memory, in addition to being a storage 
device, has the attributes that the memory words are addressable by content and the 
memory has a parallel-search capability. Content addressability means that a memory
word (or words) can be accessed by matching a selectable field of a given search word,
called an argument, instead of by an address as in a random access memory. (An
associative memory can also be location addressable.) Parallel-search capability means
that the search word can be compared with all the words of the memory. For these
reasons, associative memory is also known as parallel-search memory, or content- 
addressable memory. 

In some associative memories, all the bits of all the words of the memory
are simultaneously compared with an argument; this is called word-parallel search.
In others, only one bit of all the words is compared simultaneously with the
respective bit of the argument at one time; this is called bit-parallel search.
Bit-parallel search is slower than word-parallel search when a match-for-equality
(or inequality) operation is performed. On the other hand, an associative
memory with a bit-parallel search is less costly and more practical for an associative
memory of a larger capacity.


7.4.1 Memory Organization 


The organization of an associative memory is shown in the block diagram in
Fig. 7.6. The memory array is word-organized. Associated with the memory array
are registers A, K, B, M, MEQ, and MIEQ. Register A is the argument register;
register K is the mask register; register B is the buffer register; register M is the match
register; and registers MEQ and MIEQ are for initiating the match-for-equality and
match-for-inequality operations, respectively. The above associative memory can be
described by the following statement:


Comment, description of an associative memory (7.16)
Register, A(0-35), $argument register
          K(0-35), $mask register
          B(0-35), $buffer register
          M(0-255), $match register
          MEQ, $match-for-equality control bit
          MIEQ, $match-for-inequality control bit
Asso-memory, MEM(A,K,B)=MEM(0-255,0-35), $associative memory
Clock, P $single-phase clock
Terminal, Z=K(0)+K(1)+...+K(35),

Fig. 7.6 Organization of an associative memory (argument register A, mask register K,
memory control, memory array, buffer register B, and match register M)


To perform a match-for-equality operation, the argument is placed in register
A and the mask in mask register K. A parallel comparison for equality is then made 
between the unmasked bits of the argument and their respective bits of all the words 
of the memory array. Those words that are matched with the masked argument are 
marked by 1 in the respective bits of match register M. These micro-operations can
be described by the following micro-statements,


/MEQ*P/ M(0)<-((MEM(0,0)⊙A(0))+K(0)')*((MEM(0,1)⊙A(1))+K(1)')
              *...*((MEM(0,35)⊙A(35))+K(35)')*Z,
        M(1)<-((MEM(1,0)⊙A(0))+K(0)')*((MEM(1,1)⊙A(1))+K(1)')
              *...*((MEM(1,35)⊙A(35))+K(35)')*Z,                    (7.17)
        ...
        M(255)<-((MEM(255,0)⊙A(0))+K(0)')*((MEM(255,1)⊙A(1))+K(1)')
              *...*((MEM(255,35)⊙A(35))+K(35)')*Z,


The above micro-statements are too lengthy to write out in full. They can be shortened
in the following manner,


/MEQ*P/ M(0)<-*((MEM(0,0-35)⊙A(0-35))+K(0-35)')*Z, (7.18)
        M(1)<-*((MEM(1,0-35)⊙A(0-35))+K(0-35)')*Z,
        ...
        M(255)<-*((MEM(255,0-35)⊙A(0-35))+K(0-35)')*Z,


Notice that operator "*" is placed immediately after the arrowhead in order to indicate
that this operation applies to as many factors as the subscript indicates. The above
micro-statements may be further shortened by using operator "*" twice; thus, we have,


/MEQ*P/ M(0-255)<-**((MEM(0-255,0-35)⊙A(0-35))+K(0-35)')*Z, (7.19)


The above micro-statement shows that the contents of the match register M store 
the equality indications of the parallel comparison operation. The match-for-inequal- 
ity operation is similar except that, at the end of the comparison, the 1’s in match 
register M indicate the unmatched words. It should be noted that the above com- 
parison can result in zero, one, or more matches or nonmatches. If every bit position 
of the argument register can be chosen to compare with the respective bits of the 
memory array, this is called fully associative; otherwise, it is called partially
associative.
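The masked match-for-equality operation can be restated in a few lines of Python. This is a sketch only: the function name and the use of Python integers for words are assumptions, and the loop over words stands in for what the hardware does for all words in parallel.

```python
def match_for_equality(words, argument, mask, width=36):
    """Return the match bits M for a masked comparison.

    M[i] is 1 when words[i] agrees with the argument in every bit
    position whose mask bit is 1. Z is 0 when every bit is masked
    out, in which case no word is marked.
    """
    full = (1 << width) - 1
    z = 1 if (mask & full) else 0             # Z = K(0)+K(1)+...+K(35)
    return [z & int(((w ^ argument) & mask & full) == 0) for w in words]
```

For example, with 4-bit words 1010, 1000, and 0010, argument 1010, and mask 1100, only the two leftmost bits are compared, so the first two words are marked.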

In addition to these match-for-equality and match-for-inequality operations,
there are a number of other basic operations. The match-read and unmatch-read
operations read the matched and unmatched word or words out of the associative
memory into the buffer register, respectively. The match-write and unmatch-write
operations write a word into the location of a matched or unmatched word or words,
respectively; only the unmasked field of the word is written into the memory. The
clear-all-words operation clears all the words of the associative memory including
the tag bit of each word. (The tag bit indicates whether the word is vacant or
occupied.) The mark-all-bits operation is similar to the clear-all-words operation except
that 1’s are written into the unmasked bit positions of the associative memory array.
The write-on-next-vacancy is an operation to write a word into the memory at the
next vacancy. (Next vacancy means the first zero in the tag bits of the words in the
memory.) The find-match-count and find-nonmatch-count operations determine the
number of 1’s or 0’s in the match register M, respectively. Additionally, there are two
bit count operations to be described subsequently.


7.4.2 Match on a Numerical Argument 


Match-on-a-numerical-argument operations, while performing parallel compari- 
sons, make use of the numerical property such as equality, larger-than, or between- 
limits. Seven such operations are described below. They are: match-on-larger-than, 
match-on-smaller-than, match-on-maximum, match-on-minimum, match-on-between- 
limits, match-on-next-larger, and match-on-next-smaller operations. In the following, 
the term “argument bit” always means an argument bit which has not been
masked out by the mask (i.e., the mask bit is 1).

In performing these match-on-numerical-argument operations, additional registers
other than those shown in Fig. 7.6 are required. The configuration of these
registers and other computer elements is described below:


Comment, configuration of an associative memory (7.20)
Register, A(0-35), $argument register
          K(0-35), $mask register
          B(0-35), $buffer register
          M(0-255), $match register
          K1(0-35), $mask storage register
          K2(0-35), $mask storage register
          K3(0-35), $special mask register
          K4(0-35), $special mask register
          A2(0-35), $argument storage register
          C(0-5), $counter
          MEQ, $match-for-equality control bit
Asso-memory, MEM(A,K,B)=MEM(0-255,0-35), $associative memory


In the above, registers K1 and K2 store the mask, register A2 stores the argument,
and registers K3 and K4 store special masks.


7.4.2.1 Match-on-larger-than (smaller-than) operation


The match-on-larger-than (smaller-than) operation finds those memory words
whose magnitudes are larger (smaller) than the argument in register A. The algorithm
is described below.




(a) Scan the argument bits from the leftmost to the rightmost bit, and identify the first
0 (1) which is called the target bit.

(b) Form a new argument by replacing the target bit by a 1 (0).

(c) Form a new mask by masking out all argument bits to the right of the target bit.

(d) Perform a match-on-equality operation for the new argument. (The matched words
are marked by 1’s in register M and are accumulated in register M.)

(e) Restore the target bit to state 0 (1).

(f) Repeat (a) by choosing the next 0 (1) as the target bit. Retain the argument bits to
the left and mask out the argument bits to the right of the target bit. Repeat (b)-(e).

(g) Repeat (f) until no more target bits can be located. The memory words which are
larger than the argument are now located by the 1’s in match register M.


The above algorithm is presented in the sequence chart in Fig. 7.7, where the
argument is initially stored in register A2 and the mask is initially stored in registers
K1 and K2. Note that operator shr rightshifts logically (i.e., a 0 is inserted at the
left) and operator shra rightshifts arithmetically (i.e., the leftmost bit remains
unchanged). The target bit in step (a) is identified by the following expression,


K2(0)* A2(0)' 


The new target bit is obtained by circularly leftshifting registers K2 and A2 after 
each iteration. The change of the target bit from 0 to 1 in step (b) is accomplished by 
the following micro-operation, 


A<-K3+A


where register K3 stores a special mask for extracting the target bit. This mask is
logically rightshifted after each iteration. The new mask in step (c) is formed by a
special mask in register K4 which extracts certain mask bits in register K1 as described
by the following micro-operation,


K<-K1*K4


The special mask in register K4 is rightshifted arithmetically after each iteration. 
The match-on-equality operation in step (d) is initiated by setting register MEQ to 
1, or 


MEQ<-1


The change of the target bit from 1 back to 0 in step (e) is achieved by the following
micro-operation,


A<-K3'*A


This process is repeated for each target bit until counter C reaches 35. At that time,
the matched words are those marked by 1’s in register M, and the argument and the
mask in registers A and K are restored.

Fig. 7.7 Sequence chart for match-on-larger-than-argument operation

The sequence for the match-on-larger-than-argument operation is now described 
by the following CDL procedural description. 


Comment, match-on-larger-than-argument sequence (7.21)
C<-0, M<-0, A2<-A, K2<-K, K1<-K,
K4<-400,000,000,000, K3<-400,000,000,000;

/X/ IF (K2(0)*A2(0)'≠1) THEN (GOTO Y);
A<-K3+A, K<-K1*K4;
MEQ<-1;
A<-K3'*A;

/Y/ K2<-cil K2, A2<-cil A2, K4<-shra K4, K3<-shr K3;
IF (C≠35) THEN (C<-countup C, GOTO X);
A<-A2, K<-K2;

END
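The match-on-larger-than algorithm relies on the fact that a word is larger than the argument exactly when it has a 1 where the argument has a 0 and agrees with the argument in all unmasked bits to the left. The following Python sketch mirrors steps (a) through (g); the function names, the use of Python integers for words, and the accumulation of matches by a logical OR are assumptions of this sketch rather than a transcription of the CDL sequence.

```python
def match_for_equality(words, argument, mask):
    """Masked equality match over all words (done in parallel in hardware)."""
    return [int(((w ^ argument) & mask) == 0) for w in words]


def match_on_larger_than(words, argument, mask, width=36):
    """Mark the words whose masked field is larger than the argument."""
    m = [0] * len(words)
    prefix_mask = 0                          # plays the role of K1*K4
    for i in range(width):                   # scan from the leftmost bit
        target = 1 << (width - 1 - i)        # plays the role of K3
        if not (mask & target):
            continue                         # masked-out bit: never a target
        prefix_mask |= target
        if argument & target:
            continue                         # target bits are unmasked 0's
        hits = match_for_equality(words, argument | target, prefix_mask)
        m = [a | b for a, b in zip(m, hits)]  # accumulate matches in M
    return m
```

Each unmasked 0 in the argument costs one equality match, so the number of match cycles is bounded by the word length and does not depend on the number of words.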


7.4.2.2 Match-on-between-limits operation 


The match-on-between-limits operation finds those memory words whose num- 
bers lie between two magnitude limits, an upper limit and a lower limit. The algorithm 
is shown in the flowchart in Fig. 7.8. As shown, two match operations are required. 
First, reset match register to 0 and perform a match-on-larger-than-argument opera- 
tion with the lower limit as the argument. Next, use the resulting contents of the 
match register as the initial condition, and perform a match-on-smaller-than-argument 
operation with the upper limit as the argument. The 1’s in the match register after 
these two operations indicate the memory words whose magnitudes are between the 
two limits. 


7.4.2.3 Match-on-maximum (minimum) operation


The match-on-maximum (minimum) operation finds those memory words which
have a numerical maximum (minimum). The mask indicates the field of the argument
while the argument itself is zero. The algorithm is described below.


(a) Scan the argument bits from the leftmost to the rightmost. Choose the leftmost
argument bit as the target bit.

(b) Form a new argument by setting the target bit to 1 (0) and by masking out all
other argument bits to the right of the target bit.

(c) Perform a match-on-equality operation for the new argument.

(d) If there are one or more matches, store the contents of match register M in match
storage register M2. If there is no match, do not store the contents of match register
M but restore the target bit to 0 (1).

(e) Repeat (a) by choosing the next leftmost argument bit as the target bit. Retain the
argument bits to the left and mask out argument bits to the right of the target bit.
Repeat (b), (c), and (d).

(f) Repeat (e) until there is no more target bit. The maximum (minimum) is the memory
word which has a 1 in match register M. Note that there can be more than one
memory word which contains the same maximum (minimum). Also note that the
final argument is identical to the maximum (minimum).

Fig. 7.8 Flowchart showing the algorithm for match-on-between-limits operation
(Entry; Argument <- Lower limit; do match-on-larger-than-argument operation;
Argument <- Upper limit; leave M unchanged and do match-on-smaller-than-argument
operation; the 1’s in M indicate the memory words which are between the upper and
lower limits; End)


The above algorithm is drawn into the sequence chart in Fig. 7.9. The argument 
is 0. The mask is stored in mask storage register K2. The target bit is determined by 
bit K2(0). Since the sequence chart in Fig. 7.9 is similar to that in Fig. 7.7, this se- 
quence chart is not further discussed. 

The sequence for the match-on-maximum operation is now described by the
following procedural version of the CDL.


Comment, match-on-maximum sequence (7.22)
C<-0, M<-0, M2<-0, A<-0, K2<-K;
K<-400,000,000,000, K3<-400,000,000,000;
/X/ IF (K2(0)=0) THEN (GOTO Y);
A<-K3+A;
MEQ<-1;
IF (M=0) THEN (A<-K3'*A) ELSE (M2<-M);
/Y/ K2<-cil K2, A2<-cil A2, K<-shra K, K3<-shr K3;
IF (C≠35) THEN (C<-countup C, GOTO X);
M<-M2, A<-A2, K<-K2;

END

Fig. 7.9 Sequence chart for match-on-maximum operation
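The match-on-maximum algorithm builds the maximum one bit at a time: each unmasked bit is tentatively set to 1 and kept only if at least one word still matches. The following Python sketch illustrates the idea. The function name is hypothetical, and instead of keeping a match storage register M2 the sketch performs one final equality match on the completed argument; that substitution is an assumption made here so that the case in which every word holds zero in the field is also handled.

```python
def match_on_maximum(words, mask, width=36):
    """Return (maximum field value, match bits) for the masked field."""
    argument = 0
    prefix_mask = 0
    for i in range(width):                   # scan from the leftmost bit
        target = 1 << (width - 1 - i)
        if not (mask & target):
            continue                         # bit lies outside the field
        prefix_mask |= target
        trial = argument | target            # tentatively set the target bit
        if any(((w ^ trial) & prefix_mask) == 0 for w in words):
            argument = trial                 # some word matches: keep the 1
    # Final equality match on the whole field with the completed argument.
    matches = [int(((w ^ argument) & mask) == 0) for w in words]
    return argument, matches
```

As the text notes, the final argument is identical to the maximum, and more than one word may be marked.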


7.4.2.4 Match-on-next-larger (next-smaller) operation


The match-on-next-larger (next-smaller) operation finds those memory words
which are the next larger (next smaller) than a given number. The algorithm is shown in
the flowchart in Fig. 7.10. As shown, two operations are required. First, reset the
match register to 0 and perform a match-on-larger-than (smaller-than) operation by
using the given number as the argument. Next, use the resulting contents of the
match register as the initial condition and perform a match-on-minimum (maximum)
operation. The 1’s in the match register after these two operations indicate the memory
word or words which are the next larger (next smaller) than the given number.

Fig. 7.10 Flowchart showing the algorithm for match-on-next-larger operation
(Entry; do match-on-larger-than-argument operation; leave M unchanged and do
match-on-minimum operation; the 1’s in M indicate the memory words which are the
next larger than the argument; End)




7.4.3 Match on a Boolean Argument


In the previously described match-for-equality operation, a parallel comparison 
for equality between the argument bits and their respective bits of the memory array 
is performed. This comparison for equality makes use of the logical equivalence 
operation. In a match-for-inequality operation, the parallel comparison is for in- 
equality; this comparison uses the logical-exclusive-or operation. If another logical 
operation such as logical-and, logical-or, logical-nor, logical-nand, logical-andnot, 
or logical-ornot operation is chosen, this is referred to as match on a Boolean argu- 
ment. 

If the logical-and operation is chosen, then the matches give those words whose
1’s agree with the 1’s in the argument (but not vice versa). If the logical-nand
operation is chosen, then the nonmatches give those words whose 0’s agree with the 0’s in
the argument (but not vice versa). If the logical-or operation is chosen, then the
matches give those words whose 0’s agree with the 0’s in the argument (but not vice
versa). If the logical-nor operation is chosen, then the nonmatches give those words
whose 0’s agree with the 0’s in the argument (but not vice versa).

The above matches (nonmatches) are called symmetrical matches because the
agreement requires a bit 1 (0) in the match register and a bit 1 (0) in the argument
register. If the agreement requires a bit 1 (0) in the match register and a bit 0 (1) in
the argument register, it is called an unsymmetrical match. Logical-andnot and
logical-ornot operations give unsymmetrical matches.


7.4.4 Match on a Count Argument 


There are two basic operations for bit counting. The bit-count operation counts 
the 1’s (or 0’s) of a word in the memory (located either by the content field or by an 
address if the memory is also location addressable). The bit-count-and-store operation 
additionally transfers the count in the counter into a field of the buffer register of the 
associative memory, and then stores the contents of the buffer register in the asso- 
ciative memory at the original location. Since each bit of the memory word may 
represent an attribute, the count of attributes is a useful argument for searching 
closeness in attributes. Match-on-a-count-argument operations refer to the parallel 
comparisons made on the count field of the memory array. 


7.4.4.1 Match-on-bit-count 


The match-on-bit-count of 1’s (or 0’s) operation counts the 1’s (or 0’s) of all
the memory words and then finds the matches for a given count. This sort of match
is useful in selecting the number of attributes as represented by the given number of
counts. The count of 1’s (or 0’s) and the storing of the count in each memory word
are performed by the bit-count-and-store operation; the match is performed by the
match-on-equality operation.
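The two steps of the match-on-bit-count operation can be sketched in Python as follows. The function name is hypothetical, and the sketch keeps the counts in a separate list rather than in a count field of each memory word, which is a simplification of the bit-count-and-store operation described above.

```python
def match_on_bit_count(words, count, ones=True, width=36):
    """Mark the words whose number of 1's (or 0's) equals `count`."""
    full = (1 << width) - 1
    counts = []
    for w in words:                          # bit-count-and-store step
        n = bin(w & full).count("1")
        counts.append(n if ones else width - n)
    return [int(c == count) for c in counts]  # match-on-equality step
```

If each bit of a word represents an attribute, the marked words are those possessing exactly `count` attributes.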




7.4.4.2 Match-on-maximum (minimum)-bit-count 


The match-on-maximum (minimum)-bit-count of 1’s (0’s) operation is similar to
the match-on-bit-count operation, except that the memory word or words with the
largest (smallest) count of 1’s (0’s) are found. Two operations are involved: one of them
is the bit-count-and-store operation and the other is the match-on-maximum (minimum)
operation. The second operation can be the match-on-next-larger (next-smaller)
operation if desired.


7.5 A Dynamic Loader 


When a program or a segment of a program is to be executed by a processor, 
the instructions and data of the program must first be loaded into the main memory. 
Before the loading, the loader must assign an absolute address to each word of the 
program and, if necessary, modify the address in the address field of the instruction 
accordingly. Once the program is loaded, relocation of the program is undesirable 
because of the task of address modification. This section describes a hardware loader 
which employs an addressing scheme in such a way that no address modification is 
required as long as the program or the segment of a program has been assembled 
relative to an origin. Since the loader loads a program or a segment of the program 
into whichever memory location is available at the loading time, it is referred to as a 
dynamic loader or, more descriptively, a dynamic allocating-loader. 

The dynamic loader faces a number of problems. It must know which memory 
locations are available for allocation and loading. Are the available locations large 
enough? If so, how can they conveniently be allocated and loaded and still be protect- 
ed from other programs in the memory since these available locations are usually 
scattered rather than contiguous? How is the loader to handle the allocation and 
loading of an additional segment of a program if the segment is needed during the 
execution? In addition, there is the problem of storage release, the problem of seg- 
ment sharing, and the problem of segment relocation. This section describes the solu- 
tion to some of these problems. 

In the following, the program is assumed to consist of one or more segments, 
and the origin of each program or of the segment of a program is assumed to be zero. 


7.5.1 Loader Organization 


The organization of the dynamic loader is shown in Fig. 7.11 where there are a
main memory, an associative memory, thirteen registers, and a memory bus. The
registers are: address register AR, buffer register B, next-instruction address register
NI, current-segment-number register CSN, register FPA for storing the first page-address
of the available segment, register NSN for storing the segment number of
the new segment, register NSPN for storing the page number of the new segment,
page counter PC, argument register A, mask register K, buffer register D, and control
registers MR and MW. The first two registers are associated with the main memory
M and the last five with the associative memory AM. The memory bus is attached
to buffer register B.

Fig. 7.11 Loader organization (main memory M(0-32767,0-47) with address register
AR(0-14), buffer register, and memory bus MBUS(0-47); associative memory
AM(0-511,0-31) with argument register, mask register, buffer register, and match-read
and match-write controls; next-instruction address register NI(0-14); first page
address register FPA(0-7); new segment number register NSN(0-4); new-segment
page-number register NSPN(0-4); current segment number register; page counter)

These elements are described by the following statements:


Comment, configuration of a dynamic loader (7.23)
Register, AR(0-14), $address register of main memory
          B(0-47), $buffer register of main memory
          NI(0-14), $next-instruction address register
          CSN(0-4), $current segment-number register
          FPA(0-7), $first page address register
          PC(0-7), $page counter
          NSN(0-4), $new segment number register
          NSPN(0-4), $new segment page number register
          A(0-31), $argument register
          K(0-31), $mask register
          D(0-31), $buffer register
          MR, $match-read control bit
          MW, $match-write control bit
Memory, M(AR)=M(0-32767,0-47), $main memory
Bus, MBUS(0-47), $main memory bus
Asso-memory, AM(A,K,D)=AM(0-511,0-31), $associative memory
Subregister, A(X,Y,Z,SN,S)=A(0-7,8-15,16-23,24-28,29-31),
             K(X,Y,Z,SN,S)=K(0-7,8-15,16-23,24-28,29-31),
             D(X,Y,Z,SN,S)=D(0-7,8-15,16-23,24-28,29-31),
             NI(PA,LA)=NI(0-7,8-14),
             AR(PA,LA)=AR(0-7,8-14),

The main memory is conventional except that it is divided into pages. The
characteristics of main memory M are shown in Table 7.2. The main memory has 256
pages, each containing 128 48-bit words located contiguously. The memory address
consists of two parts, an 8-bit page address PA for each of the 256 pages, and a 7-bit
line address LA for each of the 128 words in a page. The pages of the main memory
and the two parts of the main memory address register are shown in Fig. 7.12.


TABLE 7.2 Characteristics of the Main Memory


CHARACTERISTIC    DESCRIPTION
Word length       48 bits
Page size         128 contiguous words
Capacity          256 pages or 32,768 words
Page address      8 bits
Line address      7 bits
Total address     15 bits
Cycle time        1 microsecond
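The division of the 15-bit main memory address into a page address and a line address can be made concrete with a short Python sketch. The function names are illustrative; the field widths are those of subregisters AR(PA,LA)=AR(0-7,8-14).

```python
PAGE_BITS = 8    # page address PA: one of 256 pages
LINE_BITS = 7    # line address LA: one of 128 words in a page


def split_address(address):
    """Split a 15-bit main memory address into (PA, LA)."""
    return address >> LINE_BITS, address & ((1 << LINE_BITS) - 1)


def join_address(pa, la):
    """Combine a page address and a line address into a 15-bit address."""
    return (pa << LINE_BITS) | la
```

Since PA occupies the leftmost bits, replacing PA while keeping LA moves a reference to the same line of a different page, which is what allows a program assembled relative to an origin to be placed in whichever pages are available.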

The characteristics of associative memory AM are shown in Table 7.3. The
associative memory consists of 256 32-bit words. As illustrated in Fig. 7.12, each
word points to one and only one page in the main memory. Each word consists of
five fields; each field is maskable by the mask in mask register K. Each word in the
associative memory is unique; therefore, multiple matches do not occur during
matching. The associative memory is capable of performing two operations,
match-read and match-write. The cycle time of the associative memory is 1/4 of the
main-memory cycle time.

Fig. 7.12 Mapping from the address space (256 words of the associative memory)
to memory space (256 pages of main memory)


TABLE 7.3 Characteristics of the Associative Memory


CHARACTERISTIC    DESCRIPTION
Word length       32 bits
Capacity          256 words
Associativity     Full associative and maskable
Matching          Single match only
Operations        (a) match-read
                  (b) match-write
Cycle time        1/4 microsecond for match-read or write


The word format of the associative memory word is shown in Fig. 7.13(a). There
are three 8-bit fields for page addresses X, Y, and Z; a 5-bit field for segment number;
and a 3-bit field for status indication. Page address X indicates the location of a page
in the main memory, and page address Y the location of the next page. By means of
256 pairs of page addresses X and Y, the 256 pages of the main memory can be
linked into one segment. By means of the 5-bit segment number, the 256 pages can
be linked into as many as 32 segments. Page address Z indicates the order of the
page in a segment for use in operand address fetch, as will be described later. The
status field is used to indicate the status of the page; an example of the status
designation is shown in Table 7.4.

Fig. 7.13 An example of associative memory map: (a) word format (page address X,
8 bits; page address Y, 8 bits; page address Z, 8 bits; segment number, 5 bits;
status, 3 bits); (b) memory map


TABLE 7.4 An Example of Status Designation 


STATUS FIELD DESIGNATION 
000 Loaded 
001 Released 
010 Reserved 
011 Permanently loaded 
100 Available for loading 
101 Shared 
110 Segment number register 
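
The word format and the masked match can be sketched in Python as follows. The sketch assumes that bit 0 of the CDL subregister notation is the leftmost bit, so that field X occupies the high-order 8 bits, and that a 1 in the mask means "compare this bit" (which is how the mask 255-0-0-31-7 used later masks out page addresses Y and Z). The names `Status`, `pack_word`, `unpack_word`, and `matches` are illustrative only.

```python
from enum import IntEnum


class Status(IntEnum):
    """Status designations of Table 7.4."""
    LOADED = 0b000
    RELEASED = 0b001
    RESERVED = 0b010
    PERMANENTLY_LOADED = 0b011
    AVAILABLE = 0b100
    SHARED = 0b101
    SEGMENT_NUMBER_REGISTER = 0b110


def pack_word(x, y, z, sn, s):
    """Build a 32-bit word X-Y-Z-SN-S (8, 8, 8, 5, and 3 bits)."""
    for value, bits in ((x, 8), (y, 8), (z, 8), (sn, 5), (s, 3)):
        if not 0 <= value < (1 << bits):
            raise ValueError("field value out of range")
    return (x << 24) | (y << 16) | (z << 8) | (sn << 3) | int(s)


def unpack_word(word):
    """Return the fields (X, Y, Z, SN, S) of a 32-bit word."""
    return ((word >> 24) & 0xFF, (word >> 16) & 0xFF, (word >> 8) & 0xFF,
            (word >> 3) & 0x1F, Status(word & 0x7))


def matches(word, argument, mask):
    """Masked comparison: only bit positions with mask bit 1 are compared."""
    return (word ^ argument) & mask == 0


word = pack_word(1, 5, 0, 1, Status.LOADED)
mask = pack_word(255, 0, 0, 31, 7)            # compare X, SN, and S only
print(hex(word), unpack_word(word))
print(matches(word, pack_word(1, 0, 0, 1, 0), mask))
```

Note that the status value 111 is not designated in Table 7.4, so `unpack_word` raises an error for it in this sketch.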


An associative memory map is exemplified in Fig. 7.13(b), where the main 
memory has only 12 pages. Program P1 (or segment P1 of a program) is loaded in 
three pages with page addresses 1, 5, and 9; these pages are linked into a linked list 
of 1-5-9 and assigned segment number 1. The first page of program P1 is located 
at page address 1. The segment number, the first page address, and the program name 
P1 are stored in the associative memory and shown at the fourth from the last line 
of the associative memory map in Fig. 7.13(b). The page at page address 9 is the last 
page of segment number 1; this is indicated by the asterisk in the page-address Y 
field of the 10th line in Fig. 7.13(b). Similarly, programs P2, P3, and P4 consist of the 
linked lists of pages, 6-4-8, 3-11-0, and 2-10-7 with segment numbers 2, 3, and 4, 
respectively. By means of the segment status designations in Table 7.4, segment 
number 4 contains the pages available for loading (referred to as available segment); 
segment number 1 is a loaded segment; segment number 2 has been released and is 
ready to become a part of available segment; segment number 3 has been permanently 
loaded (for such use as the resident part of the supervisor program). 
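
The linking of pages into segments can be traced with a short Python sketch built from the 12-page example. The dictionary layout and the use of `None` for the segment-termination symbol are assumptions made for illustration; an associative search on page address X is modeled as a dictionary lookup.

```python
END = None  # stands in for the segment-termination symbol "*"

# page address X -> (page address Y of the next page, segment number)
MEMORY_MAP = {
    1: (5, 1), 5: (9, 1), 9: (END, 1),      # P1, linked list 1-5-9
    6: (4, 2), 4: (8, 2), 8: (END, 2),      # P2, linked list 6-4-8
    3: (11, 3), 11: (0, 3), 0: (END, 3),    # P3, linked list 3-11-0
    2: (10, 4), 10: (7, 4), 7: (END, 4),    # P4, linked list 2-10-7
}

# segment number -> first page address (the segment number registers)
FIRST_PAGE = {1: 1, 2: 6, 3: 3, 4: 2}


def pages_of(segment):
    """Follow the X-to-Y links of a segment and return its pages in order."""
    pages = []
    page = FIRST_PAGE[segment]
    while page is not END:
        next_page, sn = MEMORY_MAP[page]
        if sn != segment:
            raise ValueError("page belongs to another segment")
        pages.append(page)
        page = next_page
    return pages


for segment in sorted(FIRST_PAGE):
    print(segment, pages_of(segment))
```

Every one of the 12 pages appears in exactly one list, which reflects the statement that each associative memory word points to one and only one page.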

The last four lines in the associative memory map in Fig. 7.13(b) represent 
associative memory words which function as segment-number registers as indicated 
by 110 in their status fields. As mentioned, the first page addresses of the segments 
are stored in the address X fields. The fields for page addresses Y and Z are used to 
store the return addresses and will be described further. The segment number and 
status fields store the segment number and segment status as before. 

It should be noted that the order of the words in Fig. 7.13(b) is not necessarily 
the actual order in the associative memory; in fact, the actual order may not be known. 


7.5.2 Allocation and Loading 


Before the loader is initiated for loading a program segment to the main memory, 
the supervisor program determines that the program-segment size is smaller than 
the available-segment size, and then stores the number of pages of the new segment 
into the new-segment page-number register NSPN and the segment-number assigned 
to the new segment into the new-segment-number register NSN. The first page address 
of the available segment is in the first page address register FPA. 

The loading process consists of two parts. The first part allocates the available 
pages of the available segment to the new segment and stores the allocation in the 
form of the previously-described linear list in the associative memory. The second 
part actually loads the program segment from a backing storage, such as a disk 
memory, into the main memory. 

The loading process is shown in the sequence chart in Fig. 7.14. The sequence 
is initialized by resetting page counter PC to 0 and setting up an argument in register 
A and a mask in register K. The argument consists of the value in register FPA as 
page address X, the value of 9 as the chosen segment number for the available 
segment, and the value of 4 as the status designation. The mask is set up with page 
addresses Y and Z masked out. Next-instruction address register NI is initialized by 
taking the value in register FPA as the page address and by taking the value of 0 as 
the line address. Current segment-number register CSN is initialized by taking the 
value in the new segment-number register NSN as the current segment number. 
These two registers NI and CSN are initialized for the second part of the loading 
process. 


[Figure: sequence chart, first part (entry, initialization, and allocating loop)] 

Fig. 7.14(a) Sequence chart of the dynamic loader 


The first part of allocation now begins by decrementing the contents of register 
NSPN by one. Associative memory AM match-reads the first word of the available 
segment out of the associative memory into buffer register D. Page address Y in 
register D is transferred to the first page-address register FPA. Next, a new word is 
formed in register D. This new word contains page address Z, which is the page 
number in page counter PC; segment number SN, taken from register NSN; and the 
status designation of 0 (loaded status). Subregister D(Y) of the new word should 
contain the segment-termination symbol “*” if register NSPN reaches 0. In either 
case, the new word is next match-written into the associative memory. If the 
new-segment page-number register NSPN is nonzero, page counter PC is incremented 
by 1 and the argument is set up again to match-read the next linked word out of the 
associative memory and to match-write another new word into the associative 
memory. This process continues until register NSPN becomes 0 (i.e., until the last 
page of the new segment is reached). At this time, the first part of allocation is 
completed. 


[Figure: sequence chart, second part (loading loop with interrupt and return exits)] 

Fig. 7.14(b) 



As an example, a segment of 600 words of instructions and data is to be 
allocated. Let 3 be the assigned segment number, and let the available segment (segment 
number 9) have the following sequence of pages: 56-34-103-6-89-201-153-55. The 
associative-memory map before allocation is shown in Fig. 7.15(a). The first 128 
words of the segment are assigned to page address 56; the required operations for 
this assignment are indicated in Fig. 7.15(a). The second 128 words are assigned to 
page address 34, and similarly the third, fourth, and fifth 128 words are assigned; 
40 words in the fifth page are left unused. After the allocation, the associative memory 
map is shown in Fig. 7.15(b). 
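
The allocating loop can be checked against this example with a small Python simulation. It is a sketch under stated assumptions: the associative memory is modeled as a dictionary keyed by page address X, `None` stands in for the segment-termination symbol, and a failed match-read is modeled as a `LookupError`. The function name `allocate` and the dictionary keys are illustrative.

```python
END = None                      # stands in for the termination symbol "*"
AVAILABLE_SN = 9                # segment number chosen for the available segment
AVAILABLE, LOADED = 4, 0        # status designations used by the loader


def allocate(am, fpa, nspn, nsn):
    """Allocate nspn pages of the available segment to segment nsn.

    am maps page address X to a dict with keys y, z, sn, s.
    Returns the new first page address of the available segment.
    """
    pc = 0                                    # page counter PC
    x = fpa                                   # argument field A(X)
    while True:
        nspn -= 1                             # NSPN <- countdn NSPN
        d = am[x]                             # match-read into register D
        if (d["sn"], d["s"]) != (AVAILABLE_SN, AVAILABLE):
            raise LookupError("no match in the available segment")
        fpa = d["y"]                          # FPA <- D(Y)
        new = {"y": d["y"], "z": pc, "sn": nsn, "s": LOADED}
        if nspn == 0:
            new["y"] = END                    # last page of the new segment
            am[x] = new                       # match-write
            return fpa
        am[x] = new                           # match-write
        pc += 1                               # PC <- countup PC
        x = fpa                               # A(X) <- FPA


chain = [56, 34, 103, 6, 89, 201, 153, 55]
am = {x: {"y": y, "z": 0, "sn": AVAILABLE_SN, "s": AVAILABLE}
      for x, y in zip(chain, chain[1:] + [END])}
pages_needed = -(-600 // 128)                 # 600 words need 5 pages
new_fpa = allocate(am, fpa=56, nspn=pages_needed, nsn=3)
print(pages_needed, new_fpa, 5 * 128 - 600)
```

After the call, pages 56, 34, 103, 6, and 89 carry segment number 3 with page addresses Z of 0 through 4, and the available segment begins at page 201.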

The second part of loading the segment now begins by changing the status of 
the argument to 0 and masking out page address Y, page address Z, and segment 
number SN of the argument. The line address in subregister NI(LA) is tested to 
determine whether it is 127. If it is not 127, only the line address needs to be 
incremented by one and the page address needs no change. If it is 127, a new page address 
needs to be fetched from the associative memory; this is done by a match-reading 
operation with the page address in subregister NI(PA) as page address X of the 
argument. Page address Y in buffer register D is the new page address after the 
match-reading operation; this page address is now transferred to subregister NI(PA). After 
the match-reading operation, subregisters D(SN) and D(Y) are both tested. If 
subregister D(SN) does not agree with the segment number in register CSN, an interrupt 
is initiated; this comparison serves as memory protection. If subregister D(Y) contains 
the segment-termination symbol “*”, the loading of the segment is completed. 

No matter whether subregister NI(LA) contains 127 or not, the memory address 
in register NI is now transferred to address register AR, and the main memory word 
now assumed ready at memory bus MBUS is stored into the main memory. The 
loading now returns and again tests the line address in subregister NI(LA). This 
process continues until the segment-termination symbol is reached. 
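
The address-advancing step of the loading loop can be sketched in Python as below. The sketch is simplified: the status field, which the loader also includes in the match, is omitted; the associative memory is a dictionary keyed by page address X; `None` stands in for the termination symbol; and `ProtectionInterrupt` and `next_address` are names invented here.

```python
END = None  # stands in for the segment-termination symbol "*"


class ProtectionInterrupt(Exception):
    """Raised when D(SN) disagrees with the current segment number CSN."""


def next_address(ni, am, csn):
    """Advance NI = (page address, line address) by one word.

    am maps page address X to (page address Y, segment number).
    Returns the next (PA, LA), or None when the segment is exhausted.
    """
    pa, la = ni
    if la == 127:
        y, sn = am[pa]                 # match-read with A(X) = NI(PA)
        if sn != csn:
            raise ProtectionInterrupt
        if y is END:
            return None                # loading is completed
        pa = y                         # NI(PA) <- D(Y)
    la = (la + 1) % 128                # countup wraps from 127 to 0
    return pa, la


am = {56: (34, 3), 34: (END, 3)}
print(next_address((56, 5), am, csn=3))
print(next_address((56, 127), am, csn=3))
print(next_address((34, 127), am, csn=3))
```

The associative memory is consulted only once per 128 words, when the line address is about to recycle.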

The allocating and loading sequences in the sequence chart in Fig. 7.14 are 
described by the following procedural statements. 


Comment, allocating sequence of the dynamic loader (7.24) 
Comment, initialization 
     PC←0, A(X,Y,Z,SN,S)←FPA-0-0-9-4, K(X,Y,Z,SN,S)←255-0-0-31-7; 
     NI(PA,LA)←FPA-0, CSN←NSN; 
Comment, here begins the allocating loop 
/W/  NSPN←countdn NSPN, MR←1; 
     FPA←D(Y), D(Z,SN,S)←PC-NSN-0; 
     IF (NSPN=0) THEN (GOTO X) ELSE (MW←1); 
     PC←countup PC, A(X)←FPA, GOTO W; 
/X/  D(Y)←*; 
     MW←1; 


[Figure: two associative memory maps with columns Page address X | Page address Y | 
Page address Z | Segment number | Status; the rows are grouped into loaded segments, 
the available segment, and the segment number registers] 

Fig. 7.15 An example of an associative memory map: (a) before allocation; 
(b) after allocation 


Comment, loading sequence of the dynamic loader 
Comment, initialization 
     A(S)←0, K(X,Y,Z,SN,S)←255-0-0-0-7; 
Comment, here begins the loading loop 
/Y/  IF (NI(LA)≠127) THEN (GOTO Z); 
     A(X)←NI(PA); 
     MR←1; 
     IF (D(SN)≠CSN) THEN (GOTO INTERRUPT); 
     IF (D(Y)=*) THEN (GOTO RETURN) ELSE (NI(PA)←D(Y)); 
/Z/  NI(LA)←countup NI(LA); 
     AR←NI; 
     B←MBUS; 
     M(AR)←B, GOTO Y; 
     END 


7.5.3 Instruction Sequencing 


Instruction sequencing is controlled by the next-instruction address register NI; 
the lower 7 bits constitute the line-address counter for sequencing the line address, 
and the upper 8 bits store the page address. The execution of a segment begins at the 
first line of the first page of the segment. Within a page, the line address is advanced 
by the line-address counter. When the line-address counter recycles from the value 
of 127 to 0, a new page needs to be obtained from the associative memory. The 
manner in which the new page and line addresses are obtained is identical to that 
described in the second part of the loading process. Notice that the actual memory 
addresses of the instructions of the segment are not known until these instructions 
are being executed. 


7.5.4 Operand Address Fetch 


The instruction being executed has a 15-bit operand address. This address is 
relative to the origin of the segment but not to the actual memory address because 
the operand addresses of the instructions of the segment have not been changed during 
the loading. Therefore, the actual operand address where the operand is stored in 
the main memory must be found. The line address of this actual operand address, 
however, is the same as the original lower 7-bit address which, now in the buffer 
register B, is transferred to subregister AR(LA). The page address of the actual 
operand address is found from the associative memory by using page address Z in 
the argument (instead of page address X as in the case of instruction sequencing). 
After the match-reading operation, page address X in buffer register D is the page 
address of the actual operand address which is then transferred to subregister AR(PA) 
for fetching the operand. 
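
The operand address fetch amounts to a lookup on page address Z. The Python sketch below assumes that the argument also carries the current segment number, so that the match is unique among segments; the text states only that page address Z is used in the argument. The associative search is modeled as a scan over (X, Z, SN) triples, and the name `translate` is illustrative. The sample words are those of segment 3 in the 600-word allocation example.

```python
def translate(relative, am_words, csn):
    """Map a 15-bit segment-relative address to an actual memory address.

    am_words is an iterable of (page address X, page address Z, segment number).
    """
    z, la = relative >> 7, relative & 0x7F     # relative page and line address
    for x, word_z, sn in am_words:             # models the associative match
        if word_z == z and sn == csn:
            return (x << 7) | la               # AR(PA) <- D(X), AR(LA) <- line
    raise LookupError("no page with this Z in the segment")


# Segment 3 after allocation: pages 56-34-103-6-89 with Z = 0..4.
SEGMENT_3 = [(56, 0, 3), (34, 1, 3), (103, 2, 3), (6, 3, 3), (89, 4, 3)]

actual = translate(300, SEGMENT_3, csn=3)
print(actual, actual >> 7, actual & 0x7F)
```

Relative address 300 lies on the third page of the segment (Z = 2) at line 44, so it maps to line 44 of page 103.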


7.5.5 Branching, Indexing, and Indirect Addressing 


Branching may occur when a transfer instruction is being executed. In this case, 
the operand address is an instruction address. The operand address is again fetched 
in the same manner as described above except that the line address in buffer register 
B and the page address from the associative memory are now transferred to the 
next-instruction address register NI, instead of to address register AR. 

The operand address fetch above does not prevent the use of indexing and 
indirect addressing. To additionally provide indexing, the contents of the index 
register are added to (or subtracted from) the original relative address before the 
operand address is fetched. For indirect addressing, an operand address instead of 
an operand is fetched, and the operand address fetch is repeated. 


7.5.6 Program Return and Storage Release 


If an interrupt occurs during the execution of the instructions of a segment, 
the supervisor program places the return address (both page and line addresses) in 
the segment number register assigned to this segment as illustrated by P1, P2, P3, and 
P4 in Fig. 7.13, and then selects another segment for execution. Whenever the execu- 
tion of the original segment is resumed, the supervisor program obtains the return 
address from the segment number register assigned to the original segment. 

When the execution of a segment is completed, the supervisor program may 
decide to release the pages of the segment. The release can be done simply by linking 
the pages to the available segment. 


7.6 Memory Buffer 


Storage hierarchy in the form of a relatively fast but small main memory such as 
a magnetic-core memory and a relatively slow but large mass storage such as a mag- 
netic drum or disk storage has been employed ever since the large-scale digital com- 
puter system was first built. The basic idea behind such a storage hierarchy is to have 
the mass storage provide the necessary storage capacity and to have the main memory 
give the desired processing speed. Such a storage hierarchy is at a microsecond/milli- 
second level. 

An important development in memory organization in recent years is to extend 
the idea of storage hierarchy to a nanosecond/microsecond level. This idea, while in 
the embryonic form, was implemented in a number of computers (27, 29, 30) by 
using registers or even a very small capacity memory. It was proposed by Bloom 
et al. (28) in 1962 and by Lee (31) in 1963 as a “look-aside memory” and by Wilkes (33) in 
1965 as a “slave memory.” It was first implemented as a memory buffer in the IBM 
System/360 model 85 computer (37) whose buffer is called the “cache” and is 
transparent to the programmer. 

This section describes a memory buffering organization and operation similar 
to that implemented in the IBM System/360 model 85. 


7.6.1 Memory Buffering 


Conventionally, the main memory of a computer system is referenced by the 
CPU, one memory word at a time; the processing in the CPU is limited by the speed 
of the main memory. This limitation has become more critical as the capacity of the 
main memory becomes larger and larger and the CPU speed grows faster and faster. 

If a small-capacity memory, one order of magnitude faster than the main memory, 
is used as a buffer (Fig. 7.16), the processing in the CPU could be greatly speeded up 
because the number of main memory references can be sharply reduced for the 
following reasons: 


1. The transfer from the main memory to the buffer memory can be made a block 
(i.e., several words) at a time. If the main memory has multiple-way interleaving, the 
block of words can be transferred in one main-memory cycle time. 


2. The block transfer may prefetch the desired words into the buffer memory and make 
them available to the CPU, because there is a great probability that the other words 
of a referenced block would be soon needed. 


3. The words in the buffer memory may be used several times because of iterative loops 
and subroutines in a program, thus greatly reducing the need for many references 
to the main memory. 


[Figure: the buffer memory placed between the main memory and the CPU] 

Fig. 7.16 Memory buffering 


In the subsequent description, the main memory and the buffer memory are 
chosen with the characteristics shown in Table 7.5. The main memory has a cycle 
time of one microsecond, a data transfer width (i.e., word length) of 128 bits, and a 
capacity of 64K 128-bit words (where K represents a multiplier of 1024). Moreover, 
it is 4-way interleaved. The buffer memory has a cycle time of 80 nanoseconds, a 
data transfer width of 128 bits, and a capacity of 1,024 words. Both memories are 
divided into 4-word blocks; thus, there are 16K blocks in the main memory and 
256 blocks in the buffer memory. Every 16 contiguous blocks form a page; thus, 
there are 1K pages in the main memory and 16 pages in the buffer memory. Data 
transfer between the main memory and the buffer memory is one block at a time; 
data transfer between the buffer memory and the CPU is one word at a time. The 
main memory requires a 16-bit address, while the buffer memory requires a 10-bit 
address. Their formats are shown in Fig. 7.17. The main memory address consists of 
a 10-bit page address, a 4-bit block address, and a 2-bit word address. The buffer 
memory address format is identical except that the page address is 4 bits long. 


TABLE 7.5 Characteristics of the Main Memory and the Buffer Memory 


CHARACTERISTICS       MAIN MEMORY               BUFFER MEMORY 
Memory cycle time     1 microsecond             0.08 microsecond† 
Data transfer width   128 bits or 1 word        128 bits or 1 word 
Data units            (a) 128 bits per word     (a) 128 bits per word 
                      (b) 4 words per block     (b) 4 words per block 
                      (c) 16 blocks per page    (c) 16 blocks per page 
Memory capacity‡      (a) 64K words,            (a) 1,024 words, 
                      (b) 16K blocks, or        (b) 256 blocks, or 
                      (c) 1K pages              (c) 16 pages 
Interleaving          4-way                     None 
Address register      16 bits                   10 bits 

† CPU cycle time is also 0.08 microsecond 
‡ K represents a multiplier of 1,024 


[Figure, part (a): page address (bits 0-9) | block address (bits 10-13) | word address 
(bits 14-15). Part (b): page address (bits 0-3) | block address (bits 4-7) | word 
address (bits 8-9)] 

Fig. 7.17 Memory address formats: (a) main memory address format; 
(b) buffer memory address format 
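
The two address formats and the capacities of Table 7.5 can be verified with a short Python sketch. The function names are illustrative; bit 0 is taken to be the leftmost bit, so the page address occupies the high-order end of the address.

```python
WORDS_PER_BLOCK = 4
BLOCKS_PER_PAGE = 16


def split_main(addr):
    """Split a 16-bit main-memory address into (page, block, word)."""
    return addr >> 6, (addr >> 2) & 0xF, addr & 0x3


def buffer_address(buffer_page, block, word):
    """Form a 10-bit buffer-memory address from a 4-bit page, block, and word."""
    return (buffer_page << 6) | (block << 2) | word


main_words = 64 * 1024
main_blocks = main_words // WORDS_PER_BLOCK
main_pages = main_blocks // BLOCKS_PER_PAGE
print(main_blocks, main_pages)          # 16K blocks and 1K pages
print(split_main(0xFFFF))
print(buffer_address(15, 15, 3))        # last word of the buffer memory
```

Only the page address differs between the two formats; the 6-bit block and word address is carried over unchanged when a main-memory address is mapped to a buffer-memory address.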


7.6.2 Buffering Organization 


For the buffering organization to be described here, it is assumed that the first 
page of the main memory does not exist; thus, page address 0 of the main memory 
should not occur. 

As mentioned, both the main memory and the buffer memory are divided into 
pages. During operation, 16 of the 1,023 pages of the main memory are stored in the 
16 pages of the buffer memory. These 16 pages are tagged by their main-memory 
page addresses in an array of 16 page-address registers. This arrangement of page 
mapping and page-address tagging is illustrated in Fig. 7.18. 

As also mentioned, each page in the main memory and buffer memory is divided 
into 16 blocks. The 16 blocks in a page of the buffer memory are illustrated in Fig. 
7.19, where each block is further divided into four words (not shown). As also shown 
in Fig. 7.19, associated with each page of the buffer memory is a register which 
holds a 10-bit page address of the main memory, a 16-bit block validity register whose 
16 bits store the status (1 means valid) of the 16 blocks of the page, and a register 
which holds a 4-bit page address of the buffer memory. Since there are 16 pages in 
the buffer memory, there is an array of 16 main-memory page-address registers P, 
an array of 16 validity registers V, and an array of 16 buffer-memory page-address 
registers Q. Thus, one validity register, one main-memory page address register, and 
one buffer-memory page address register are associated with one page of the buffer 
memory. 


[Figure: 16 of the 1,023 pages of the main memory mapped into the 16 pages of the 
buffer memory] 

Fig. 7.18 Mapping between the pages in the main memory and those in the 
buffer memory 


[Figure: one page consisting of blocks 0 through 15, with its main-memory page-address 
register P(,0-9), its buffer-memory page-address register Q, and its validity register 
V(,0-15) holding one bit for each of blocks 0 through 15] 

Fig. 7.19 Blocks in a page and the associated P, Q, and V registers (16 blocks 
in a page and 4 words in a block) 

As mentioned above, associated with each page of the buffer memory is a pair 
of registers P and Q. The P register stores the main-memory page address of the page 
in the buffer memory; the Q register stores the buffer-memory page address where 
this page in the buffer memory is stored. This is illustrated in the diagram in Fig. 
7.20. Note that the numbers shown in the buffer memory are main-memory page 
addresses; they stand for the pages themselves addressed by these page addresses. 
Furthermore, these pages may not have been completely stored in the buffer 
memory, as will be further described. 

The array of page-address registers P is made to perform three functions. The 
first function, as mentioned, is to store the page addresses of those 16 pages in the 
main memory that are (partially or completely) in the buffer memory. The second 
function is to make array P work as an associative memory so that, given a page 
address, simultaneous comparisons with the addresses in array P are made; those 
matched are indicated in the associated match register M. The third function is to 
store an activity list; the page address which is the most recently referenced by the 
CPU is placed at the top of the list, while the page address whose page in the buffer 
memory is next to be replaced is stored at the bottom of the list. 


[Figure: P registers, Q registers, and buffer memory with its buffer-memory page 
addresses] 

Fig. 7.20 Translation between main-memory page addresses and buffer-memory 
page addresses by array-registers P and Q 

The above-described buffering configuration is shown in the block diagram in 
Fig. 7.21. Main memory MM is associated with address register MAR, buffer register 
MBR, and read and write control registers READ and WRITE. Buffer memory BM 
is associated with address register BAR, buffer register BBR, and read and write 
control registers RB and WB. The effective address, the data word, and the read- 
write command, all transferred from the CPU, are stored in registers S, DATA, and 
RW, respectively. In addition, register C serves as a counter, and register B is used 
to control the buffer access sequence as will be further described. This configuration 
is now described by the following CDL statements: 


Comment, configuration of buffer-memory access sequence (7.25) 
Comment, buffering control registers 
Array-register, P(0-15,0-9),   $main-memory page-address array-register 
                Q(0-15,0-3),   $buffer-memory page-address array-register 
                V(0-15,0-15),  $block validity array-register 
Register,       M(0-15),       $match register for P array-register 
                C(0-1),        $counter 
                B,             $buffer-access control register 
Encoder,        N(0-3)=M, 
Comment, CPU registers 
Register,       S(0-15),       $CPU effective address register 
                DATA(0-127),   $CPU data register 
                RW,            $CPU read-write command register 
Subregister,    S(PA,BA,WA)=S(0-9,10-13,14-15), 
Comment, main and buffer memory and their associated registers 
Register,       MAR(0-15),     $main-memory address register 
                MBR(0-127),    $main-memory buffer register 
                READ,          $main-memory read command 
                WRITE,         $main-memory write command 
                BAR(0-9),      $buffer-memory address register 
                BBR(0-127),    $buffer-memory buffer register 
                RB,            $buffer-memory read command 
                WB,            $buffer-memory write command 
Memory,         MM(MAR)=MM(1-65535,0-127), 
                BM(BAR)=BM(0-1023,0-127), 
Block,          UPDATE(IF (M(1)=1) THEN (P(0-1,)←cir P(0-1,), Q(0-1,)←cir Q(0-1,), 
                          V(0-1,)←cir V(0-1,)), 
                       IF (M(2)=1) THEN (P(0-2,)←cir P(0-2,), Q(0-2,)←cir Q(0-2,), 
                          V(0-2,)←cir V(0-2,)), 
                       . . . 
                       IF (M(15)=1) THEN (P(0-15,)←cir P(0-15,), Q(0-15,)←cir Q(0-15,), 
                          V(0-15,)←cir V(0-15,)), 
Operator,       J(0-15)←K(0-15,) match L 
Register,       L(0-9), J(0-15), 
Array-register, K(0-15,0-9), 
/begin/ J(0)←(K(0,0)⊙L(0))*(K(0,1)⊙L(1))* . . . *(K(0,9)⊙L(9)), 
        J(1)←(K(1,0)⊙L(0))*(K(1,1)⊙L(1))* . . . *(K(1,9)⊙L(9)), 
        . . . 
        J(15)←(K(15,0)⊙L(0))*(K(15,1)⊙L(1))* . . . *(K(15,9)⊙L(9)), 
end of operator 


[Figure: block diagram of main memory MM(0-65535,0-127) with MAR(0-15), MBR(0-127), 
and READ; buffer memory BM(0-1023,0-127) with BBR(0-127); array-registers P(0-15,0-9), 
Q(0-15,0-3), and V(0-15,0-15); match register M(0-15); and registers DATA(0-127) and B] 

Fig. 7.21 A configuration of memory buffering 


The encoder above encodes the contents of match register M into a 
buffer-memory page address. The above UPDATE micro-operations update the activity 
list as will be further described. Operator match is defined as the match between 
the given main-memory page address and those 16 addresses in the P registers. This 
definition could be avoided if the previously-described operator “**” is used. 
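
The match operator and the encoder can be expressed in Python as follows. This is a sketch: each register is modeled as a list of bits, the coincidence of two bits is written as an equality test, and the encoder is assumed to return the position of the single 1 in the match register. The names `match` and `encode` are illustrative.

```python
def match(k_rows, l):
    """J(i) is 1 when every bit of row K(i,) coincides with the same bit of L."""
    return [int(all(kb == lb for kb, lb in zip(row, l))) for row in k_rows]


def encode(m):
    """Return the position of the single 1 in match register M."""
    ones = [i for i, bit in enumerate(m) if bit]
    if len(ones) != 1:
        raise ValueError("expected exactly one match")
    return ones[0]


rows = [[0, 1], [1, 1], [1, 0]]     # a 3-row, 2-bit stand-in for array P
m = match(rows, [1, 1])
print(m, encode(m))
```

In the hardware all rows are compared simultaneously; the list comprehension merely enumerates the same comparisons one row at a time.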

The sequence for accessing a word from the buffer memory is described in the 
flowchart in Fig. 7.22. When the CPU requests a memory reference, the effective 
address is transferred to the main memory address register. The array of registers P 
is then searched for the effective page address. 

For a read operation, if the page is active, the page address is put on the top of 
the activity list. If the page is not active, the page address at the bottom of the activity 
list is removed. The new page address is placed at the top, and the associated validity 
register is reset to 0 to indicate that none of the 16 blocks of the page in the buffer 
memory has been loaded from the main memory. In either case, the validity bit 
associated with the effective block address is tested. If the validity bit is 1, the block is in 
the buffer memory and the word is next read out of the buffer memory. If the validity 
bit is 0, the block (not the page) is next loaded from the main memory into the buffer 
memory. During the loading, the first word from the main memory is also transferred 
to the CPU. 


[Figure: flowchart. Entry; transfer effective address to main memory address register; 
page-search in page address registers; read or write. Write: write into main memory; 
write into buffer memory; update the activity list. Read: put the page address on top 
of the activity list, or remove page address at bottom of activity list and put new 
page address on its top; load the block to buffer memory and load the first word to 
the CPU; set validity bit to 1; or read the word from buffer memory to CPU; return] 

Fig. 7.22 Flowchart of read and write operations with memory buffering 

For a write operation, the word is always written into the main memory; this is 
known as “storage through.” If the page is active, the word is also written into the 
buffer memory, and the activity list is updated. The need for dual updating is due to 
the fact that the input-output channels also communicate with the main memory. 

It may be added that channels fetch data by way of the CPU but from the main 
memory directly, without going through the buffer. Channel stores are handled in the 
same way as the CPU stores; in this way, if a channel changes data in the buffer, 
the buffer is updated. 
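
The read and write policies described above can be put together in a small Python model. It is a behavioral sketch only: the activity list is a Python list of dictionaries rather than the P, Q, and V array-registers, timing is ignored, and the class and attribute names are invented for illustration.

```python
class BufferedMemory:
    """Toy model: 16 buffer pages, block loading on demand, store-through writes."""

    PAGES, BLOCKS_PER_PAGE, WORDS_PER_BLOCK = 16, 16, 4

    def __init__(self, main):
        self.main = main          # main memory, indexed by 16-bit word address
        self.activity = []        # activity list, most recently referenced first
        self.block_loads = 0      # number of block transfers from main memory

    def _position(self, page):
        for i, entry in enumerate(self.activity):
            if entry["page"] == page:
                return i
        return None

    def _make_active(self, page):
        """Put the page on top of the activity list, replacing the bottom if needed."""
        i = self._position(page)
        if i is None:
            if len(self.activity) == self.PAGES:
                self.activity.pop()           # page at the bottom is replaced
            entry = {"page": page, "valid": [0] * self.BLOCKS_PER_PAGE, "words": {}}
        else:
            entry = self.activity.pop(i)
        self.activity.insert(0, entry)
        return entry

    def read(self, addr):
        page, block = addr >> 6, (addr >> 2) & 0xF
        entry = self._make_active(page)
        if not entry["valid"][block]:         # load the block, not the page
            first = addr & ~0x3
            for a in range(first, first + self.WORDS_PER_BLOCK):
                entry["words"][a] = self.main[a]
            entry["valid"][block] = 1
            self.block_loads += 1
        return entry["words"][addr]

    def write(self, addr, value):
        self.main[addr] = value               # always written into main memory
        if self._position(addr >> 6) is not None:
            self._make_active(addr >> 6)["words"][addr] = value


main = [10 * a for a in range(65536)]
mem = BufferedMemory(main)
print(mem.read(64), mem.read(65), mem.block_loads)   # second read hits the buffer
mem.write(65, 7)
print(main[65], mem.read(65))
```

Because every write reaches the main memory, a block that is later loaded from the main memory is always current, which is why a write to an inactive page needs no action in the buffer.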


7.6.3 Buffer Access Sequence 


The sequential operations in accessing the buffer memory are organized as a 
sequence called the buffer-access sequence. This sequence is controlled by register B. 
The buffer-access sequence is shown in the sequence chart in Fig. 7.23. It is assumed 
that the effective address, the data word, if there is one, and the read-write command 
are initially placed by the CPU in registers S, DATA, and RW, respectively. Note 
that it has been assumed that main-memory page address 0 does not occur. 

As shown in Fig. 7.23, when register B is set to 1, the buffer-access sequence is 
activated. The effective address in register S is transferred to the main-memory 
address register MAR. The P registers are now searched for a page address that 
agrees with the effective page address in subregister S(PA). Those matched are marked 
in match register M. At this point, further operations depend on whether a read or a 
write operation is requested by the CPU. 

If it is a write operation, the data word in the DATA register is stored into the 
main memory and, if match register M does not contain 0, the activity list is updated 
by placing the matched page address on the top of the list, as will be further described, 
and the data word is stored into the buffer memory. Registers B and M are next 
reset to 0. The sequence is now completed. 

If it is a read operation, match register M is tested to determine whether the page 
is active. If the page is not active, both arrays of registers P and V are right-shifted 
and the array of registers Q is circularly right-shifted. The manner in which the P 
registers are shifted puts the effective page address on the top of the activity list and
removes the page address at the bottom of the list; this is illustrated in the diagram
in Fig. 7.24(a). The manner in which the Q registers are shifted attaches the address
of the newly available buffer-memory page to the effective page address now at the
top of the activity list. The manner in which the V registers are shifted resets to 0
those block validity bits associated with the effective page address. If the page is
active, the UPDATE micro-operations as defined by the block statement in statements
(7.25) are carried out. These micro-operations move the matched page address to the
top of the activity list and move the intervening page addresses down one position;
this is illustrated in the diagram in Fig. 7.24(b). (These are also the micro-operations
that are required to update the activity list during a write operation if register M
does not contain 0.) While the page addresses in the P registers are being moved, the
validity bits in registers V and the buffer-memory page addresses in registers Q are
similarly moved. The manner of handling the activity list, as illustrated in the
diagrams in Fig. 7.24, makes the least active page address drift down to the bottom
of the list, where it is eventually displaced once it has gone unreferenced longer than
any other page address on the list.


Fig. 7.23(a) Sequence chart for buffer-memory access sequence


Fig. 7.23(b)


Fig. 7.24 Two ways of updating the activity list: (a) put the
new page address at the top of the activity list; (b)
put the matched page address at the top of the activity
list
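
The two ways of updating the activity list can be sketched in Python as follows. This
is a minimal sketch rather than the CDL description itself: it assumes that
array-registers P, Q, and V are modeled as Python lists whose index 0 is the top of
the list, and the function names insert_new_page and promote_matched_page are
hypothetical.

```python
def insert_new_page(P, Q, V, new_page):
    """Page inactive: put new_page on top and drop the bottom entry (Fig. 7.24(a))."""
    freed_buffer_page = Q[-1]              # buffer page of the displaced bottom entry
    P[:] = [new_page] + P[:-1]             # shift down; the bottom page address is lost
    Q[:] = [freed_buffer_page] + Q[:-1]    # circulate; the freed buffer page goes on top
    V[:] = [[0] * len(V[0])] + V[:-1]      # validity bits of the new page start at 0


def promote_matched_page(P, Q, V, i):
    """Page active at position i: move its entry to the top (Fig. 7.24(b))."""
    for registers in (P, Q, V):
        # Entries above position i move down one place; those below stay put.
        registers.insert(0, registers.pop(i))
```

Because every reference moves its page address to the top, the bottom entry is always
the one that has gone longest without being referenced, which is the entry that
insert_new_page displaces.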

After the activity list is updated as a result of the request being a read operation, 
the validity bit specified by the block address in subregister S(BA) is tested. Since the 
exact validity bit depends on the particular block address, this validity bit is addressed 
by the following symbolic subscript, 


V(0,S(BA)). (7.26) 


The above manner of addressing this particular validity bit is equivalent to the descrip- 
tion by the following 16 conditional micro-statements, 


IF (S(BA)=0) THEN (V(0,0).................. ),
IF (S(BA)=1) THEN (V(0,1).................. ),
    . . .
IF (S(BA)=15) THEN (V(0,15)................ ),


which are too lengthy to be desirable; thus expression (7.26) is adopted. If the validity
bit is 1, the block of words is in the buffer memory and the particular word in the
buffer memory is read out into the DATA register. The buffer memory address is
formed by using the contents of register Q(0,) as the page address and the contents of
subregister S(BA, WA) as the block address and the word address. If the validity
bit is 0, the block of words is not in the buffer memory; this block of words is now
loaded into the buffer memory and the validity bit is set to 1. During the loading,
counter C is used to count the number of words, and the first word from the main
memory is also transferred to the CPU in order to reduce the access time. Registers B
and M are next reset to 0. The sequence is now completed.

The buffer-access sequence in the chart in Fig. 7.23 is now described by the fol- 
lowing procedural statements: 


Comment, buffer-memory access sequence begins here (7.27)
/W/ IF (B=0) THEN (GOTO W);
Comment, transfer effective address to address registers
    MAR←S;
Comment, page search in array-register P
    M(0-15)←P(0-15,) match S(PA);
Comment, determine read or write
    IF (RW=1) THEN (GOTO Y);
Comment, micro-operations for a read operation
/O/ IF (M=0) THEN (S(PA)-P←shr S(PA)-P, Q←cir Q, V←shr V)
        ELSE (DO UPDATE);
Comment, test block validity bit
    IF (V(0,S(BA))=1) THEN (GOTO X) ELSE (V(0,S(BA))←1, C←0);
Comment, load the block to the buffer memory
/R/ READ←1, C←countup C;
    MBR←MM(MAR), BAR←Q(0,)-S(BA,WA);
    BBR←MBR, WB←1;
    BM(BAR)←BBR, IF (C=1) THEN (DATA←BBR);
    IF (C≠0) THEN (BAR(8-9)←countup BAR(8-9),
                   MAR(14-15)←countup MAR(14-15),
                   GOTO R)
        ELSE (GOTO Z);
Comment, read the word from the buffer memory
/X/ BAR←Q(0,)-S(BA,WA);
    RB←1;
    BBR←BM(BAR);
    DATA←BBR, GOTO Z;
Comment, micro-operations for a write operation
/Y/ IF (M≠0) THEN (DO UPDATE);
    MBR←DATA, IF (M≠0) THEN (BBR←DATA);
    WRITE←1, IF (M≠0) THEN (WB←1);
    MM(MAR)←MBR, IF (M≠0) THEN (BM(BAR)←BBR);
/Z/ B←0, M←0, GOTO W;
END
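
The read path of the sequence (validity test, block load, and forwarding of the first
word) can be sketched in Python as follows. This is a sketch under stated assumptions,
not a translation of the statements: a block is taken to hold four words, which is
inferred from the 2-bit counters MAR(14-15) and BAR(8-9); the dictionaries that stand
in for the main memory and for one buffer-memory page, and the name read_word, are
hypothetical.

```python
WORDS_PER_BLOCK = 4   # assumed from the 2-bit counters MAR(14-15) and BAR(8-9)


def read_word(main_memory, buffer, valid, page, block, word):
    """Return (data, words_loaded) for a read of address (page, block, word).

    valid[block] plays the role of V(0,S(BA)); buffer is keyed by (block, word)
    and stands in for the buffer-memory page selected by Q(0,).
    """
    if valid[block]:                           # block present: read from the buffer
        return buffer[(block, word)], 0
    valid[block] = 1                           # block absent: mark valid, then load it
    data = None
    for count in range(WORDS_PER_BLOCK):       # counter C counts the words loaded
        w = (word + count) % WORDS_PER_BLOCK   # the 2-bit word counter wraps around
        value = main_memory[(page, block, w)]
        buffer[(block, w)] = value
        if count == 0:                         # first word is also sent to the CPU
            data = value
    return data, WORDS_PER_BLOCK
```

The load starts at the requested word rather than at word 0 of the block, so the CPU
receives its word after the first main-memory cycle while the rest of the block is
still being loaded.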


7.6.4 Performance Evaluation 


After examining some details of a memory buffering organization and its opera- 
tion, one may raise the question as to how its performance can be evaluated and 
how effective the organization really is. One way to evaluate the performance is to 
determine the buffer reference miss. The buffer reference miss is defined as the number 
of memory references not found in the buffer; it is also the number of times that a 
block is transferred from the main memory to the buffer. It depends on (a) the speed 
ratio between the buffer memory and the main memory, (b) the block size, the page 
size, and the buffer size, (c) the store algorithm and the replacement algorithm, and 
(d) the address patterns of the programs executed by the CPU. 

An address pattern is the sequence of memory addresses that results from executing
the instructions of a program and referencing its data. It can be random or sequential.
A random address pattern is a memory-address sequence in which any address is equally
likely to occur; the probability of a buffer reference miss is equal to one minus the
ratio of the buffer size to the main-memory size. Variations of block size and
replacement algorithms have no effect on the buffer reference miss. A sequential
address pattern is a memory-address sequence in which any address is exactly one
word away from the preceding address in the sequence; the probability of a buffer
reference miss is exactly the inverse of the block size. Variations of the buffer size
and replacement algorithms have no influence on the buffer reference miss.
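
The two limiting cases can be checked with a short Python sketch. The function names
and the sizes used in the example are assumptions chosen for illustration; they are
not taken from the measurements reported in this section.

```python
def random_miss_probability(buffer_words, main_words):
    """Random pattern: a reference hits only if its word happens to be buffered."""
    return 1 - buffer_words / main_words


def sequential_miss_count(n_references, block_words, start=0):
    """Sequential pattern: count block transfers while stepping one word at a time."""
    misses = 0
    loaded_block = None
    for address in range(start, start + n_references):
        block = address // block_words
        if block != loaded_block:      # first touch of a block brings it in
            misses += 1
            loaded_block = block
    return misses
```

For example, a 1,024-word buffer in front of a 65,536-word main memory gives a random
miss probability of 1 - 1/64, while 1,600 sequential references with 16-word blocks
cause 100 block transfers, that is, one miss in every 16 references.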

The actual address patterns are neither random nor sequential. Gibson (36) 
reported that, based on twenty IBM-7000-series programs each running approximately 
three million address references, the probability of buffer reference miss ranged 
approximately from 0.0025 to 0.085 with the peak occurring at about 0.015. 

Another way to evaluate the performance of memory buffering, proposed by
Liptay (37), is first to define an ideal system in which there is no buffer memory,
but the main memory operates at the buffer-memory cycle time. The ideal system is
equivalent to the buffered system when the CPU of the buffered system always finds
the data in the buffer and there is no loss of time in memory reference due to stores.
Thus, the ideal system represents an upper limit on the performance of the buffered
system. Liptay reported that, based on an evaluation over an extensive set of address
patterns, the performance of the buffered system ranged from 66 percent to 94
percent of the performance of the ideal system.

A third way to evaluate the two-level storage hierarchy, reported by Gibson (36), 
is first to define the transfer-rate ratio TR, 


TR=NT/NP (7.28) 


where NT is the average transfer rate in words per second between the main memory 
and the buffer memory, and NP is the average transfer rate in words per second 
between the buffer memory and the CPU. One would intuitively like to have transfer- 
rate NP relatively high and transfer-rate NT relatively low. A study which used the 
previously mentioned IBM-7000-series data was made and its result is presented in
Table 7.6, where the TR values are tabulated as a function of block size and buffer
size. It is apparent from the table that, to obtain a desired low transfer-rate ratio TR,
the block size should be kept small and the buffer size large. For example, a ratio
of 0.1 could be obtained by choosing 16 words as the block size and 2,048 words as
the buffer size.


TABLE 7.6 Transfer-rate Ratio, TR


                            BUFFER MEMORY SIZE, WORDS

BLOCK SIZE       32     64    128    256    512   1024   2048   4096   8192

16 words        6.5    1.9    1.4    1.0     .7    .17    .10   .045   .033
32                    12.1    3.0    2.0    1.2    .37    .16   .072   .039
64                           23.2    4.5    2.5    1.2    .27    .14   .043
128                                 44.6    7.0    3.1    .45    .22   .073
256                                        84.1   11.3    2.5    .35    .09
512                                              157.0    6.9    .73    .20
1024                                                     24.6    1.9    .42
2048                                                             7.9    1.2


7.7 Virtual Memory†


It has long been recognized that the speed and capacity required of the memory of a
computer system can be realized only hierarchically, in two or more levels. A two-level
storage hierarchy is shown in Fig. 7.25, where there can be one or more auxiliary
memories. A program, together with its data, normally resides in the auxiliary
memory. When the program or a segment of the program is to be executed, it is
brought into the main memory, because the instructions of the program or the segment
can be executed only when they are in the main memory. Thus, one may think of
auxiliary storage as containing the totality of information required for the complete
execution of all computations; it is the task of the system to maintain in the main
memory the portion of the totality that is currently active. It is the responsibility of
the computer system, not of each individual programmer, to allocate the memory space
so that, to each program, the storage hierarchy operates as if it were at one level.
In the early years, each programmer incorporated a storage allocation procedure
into his program when the totality of its information was expected to exceed the
main-memory capacity. Such a procedure, called static memory allocation, merely
divided the program into a sequence of main-memory loads (or segments) which
overlay one another. This was possible and practical because the programmer was
familiar with the computer system and his program. With the advent of multiprogramming
and time-sharing systems, there arose the need for running a partially
loaded program, for varying the amount of memory in use by a given program, and
for moving a program around in the memory during execution; this means that the
allocation of memory space is not predictable and segment overlay cannot be worked
out beforehand. The memory allocation must be dynamic. The need for dynamic
memory allocation for multiprogramming and time-sharing systems has led to the
development of virtual memory.


Fig. 7.25 A two-level memory system


†Denning [57] presented an excellent survey on virtual memory. The concepts of virtual memory
presented here are taken from this paper.

This section introduces the basic concept of virtual memory, describes address 
translation and paging, and illustrates scheduling of a virtual memory. 


7.7.1 Basic Concept 


The idea of one-level storage is now known as virtual memory. It gives the pro- 
grammer the illusion that the main memory is very large, even though it actually is 
relatively small. The computer hardware automatically moves information into the 
main memory when it is needed for processing. Thus, in a virtual memory, the task 
of memory allocation disappears from the program. 

The concept of virtual memory starts from the notion that an “address” is to be
distinguished from a “location,” though the two were regarded as identical in the
early computer systems. An address used by the programmer is called a name or a
virtual address; the set of names that may be generated by a program as it references
information is called the address space. An address used by the memory is called a
location or a memory address; the set of physical main-memory locations where
information is stored is called the memory space. Let there be N names in the address
space and M locations in the memory space, as illustrated in Fig. 7.26. There are more
virtual addresses than memory locations; thus, not every virtual address has a memory
location.

Fig. 7.26 Address translation between address space and
memory space


In order to convert a given virtual address into a memory address, an address
translation mechanism is required. An implementation of the address translation
mechanism is shown in Fig. 7.27. The translation table is stored in memory TABLE;
virtual address register VA of the CPU and memory address register MA of the main
memory serve as the address register and buffer register of memory TABLE, respectively.
Register FAULT indicates the missing-item fault. The translation of the given
virtual address a into main memory address b can be procedurally described as
follows.


Comment, description of an address-translation implementation (7.29)
Register, VA(0-11), $address register of memory TABLE
          MA(0-9), $buffer register of memory TABLE
          READ, $initiate read operation
          BLANK, $no-entry-in-memory-TABLE indicator
          FAULT, $missing-item fault indicator
Memory, TABLE(0-4095,0-9), $address translation table
VA←“a”, FAULT←0;
READ←1;
MA←TABLE(VA);
IF (MA=0) THEN (FAULT←1);
END


In the above, there are 4,096 virtual addresses, but there are only 1,024 real
addresses of the main memory. When the virtual address does not have a memory
address in memory TABLE, register FAULT is set to 1 to indicate that the word
is not in the main memory. Otherwise, main memory address b is in memory address
register MA.
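
The table lookup of statements (7.29) can be sketched in Python as follows. This is an
illustrative sketch only: the list that stands in for memory TABLE and the names
translate and NO_ENTRY are assumptions, and a table word of 0 is taken to mean that
no entry exists, as the test on MA suggests.

```python
TABLE_SIZE = 4096   # one table word per 12-bit virtual address VA(0-11)
NO_ENTRY = 0        # assumed: a table word of 0 means "not in the main memory"


def translate(table, virtual_address):
    """Return (memory_address, fault) for one lookup in the translation table."""
    memory_address = table[virtual_address]          # MA <- TABLE(VA)
    fault = 1 if memory_address == NO_ENTRY else 0   # missing-item fault
    return memory_address, fault
```

One consequence of using 0 as the no-entry mark is that main-memory address 0 itself
can never be the result of a successful translation in this sketch.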

The information named by those virtual addresses that have no memory locations is
stored in the auxiliary storage. When reference is made to such a virtual address,
register FAULT will be set to 1; this should trigger the hardware to bring in the
missing item from the auxiliary memory into the main memory. This brings up the need
for a replacement algorithm, which decides which item in the main memory ought to go;
a fetch algorithm, which decides when the item is to be loaded; and a placement
algorithm, which decides where the item is to be placed. The address translation
mechanism and these algorithms constitute an important part of the architecture design
of a virtual memory.


Fig. 7.27 An implementation of address translation


7.7.2 Paging 


The address translation implementation in Fig. 7.27 is obviously impractical, 
because memory TABLE would have as many words as the number of virtual 
addresses. In order to reduce the size of the translation table memory, information 
in the address space is grouped into blocks. A block is a group of contiguous addresses 
in the address space. Entries in the translation table will now refer to blocks; there 
are far fewer blocks than addresses. 

An address translation implementation is shown in Fig. 7.28, where the memory
space is organized into unnamed (though addressable) blocks of fixed size. These
unnamed blocks are called pages. Assume that there are 1,024 words in a page. The
translation table, now called the page table, is stored in memory PAGETABLE,
which has 256 6-bit words. Subregister VA(P) contains the virtual page address,
while subregister VA(W) contains the virtual word address. Subregister VA(P) is
the address register of memory PAGETABLE. Subregister MA(Q) contains the
main-memory page address, while subregister MA(W) contains the main-memory
word address; subregister MA(Q) is also the buffer register of memory PAGETABLE.
Since subregister VA(P) has eight bits but subregister MA(Q) has six bits, there are
256 virtual page addresses but 64 memory page addresses. The translation from the
given virtual address p-w into memory address q-w can be procedurally described as
follows:


Comment, description of a paging implementation (7.30)
Register, VA(0-17), $address register of memory PAGETABLE
          MA(0-15), $buffer register of memory PAGETABLE
          READ, $initiate read operation
          FAULT, $missing-page fault indicator
Subregister, VA(P,W)=VA(0-7,8-17),
Subregister, MA(Q,W)=MA(0-5,6-15),
Memory, PAGETABLE(VA(P))=PAGETABLE(0-255,0-5),
VA←“p-w”, FAULT←0;
READ←1;
MA(Q)←PAGETABLE(VA(P)), MA(W)←VA(W);
IF (MA(Q)=0) THEN (FAULT←1);
END


When the virtual address does not have a memory page address in memory PAGETABLE,
register FAULT is set to 1; otherwise, main memory address q-w is in memory address
register MA.
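
The field splitting and lookup of statements (7.30) can be sketched in Python as
follows. This is a minimal sketch under the field widths given above (an 8-bit page
address and a 10-bit word address); the list standing in for memory PAGETABLE and the
name translate_paged are hypothetical.

```python
PAGE_BITS = 10            # VA(W) and MA(W): 1,024 words per page
PAGE_TABLE_WORDS = 256    # VA(P) has eight bits


def translate_paged(page_table, virtual_address):
    """Split p-w, look up q, and return (memory_address, fault)."""
    p = virtual_address >> PAGE_BITS               # VA(P): the high-order eight bits
    w = virtual_address & ((1 << PAGE_BITS) - 1)   # VA(W): the low-order ten bits
    q = page_table[p]                              # MA(Q) <- PAGETABLE(VA(P))
    fault = 1 if q == 0 else 0                     # q = 0 marks a missing page
    return (q << PAGE_BITS) | w, fault             # word address passes unchanged
```

Only the page address is translated; the word address is copied from VA(W) to MA(W),
which is why the table needs one entry per page rather than one per word.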
There are three ways to store the page table: (a) in a small but fast memory as
shown in Fig. 7.28, (b) in the main memory, or (c) in a very fast associative memory.
In the first case, an additional memory is required as well as one extra memory
access time. In the second case, two accesses of the main memory are required, and
thus the program runs at half speed (unless there are register-to-register instructions
which do not reference memory). In the third case, because of the cost, only a very
small associative memory is used. Each associative register stores one entry; only
the most recently used entries are stored. When a given virtual page address is
referenced and its entry is in the associative memory, the memory page address can be
generated almost immediately; otherwise, reference to the page table in the main
memory has to be made. It has been found that eight to sixteen associative registers
are adequate to cause programs to run at nearly full speed (49).


Fig. 7.28 A paging implementation
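
The third arrangement can be sketched in Python as follows. This is a sketch under
assumptions: the class name AssociativeRegisters is hypothetical, the page table in
the main memory is modeled as a list, and the entry dropped when the registers are
full is the least recently used one, which is one reasonable reading of "only the
most recently used entries are stored."

```python
from collections import OrderedDict


class AssociativeRegisters:
    """A few recently used page-table entries kept in front of the full table."""

    def __init__(self, page_table, size=8):
        self.page_table = page_table
        self.size = size
        self.entries = OrderedDict()   # virtual page -> memory page, oldest first
        self.table_references = 0      # extra references to the page table

    def lookup(self, p):
        if p in self.entries:          # found associatively: no table reference
            self.entries.move_to_end(p)
            return self.entries[p]
        self.table_references += 1     # otherwise consult the page table
        q = self.page_table[p]
        self.entries[p] = q
        if len(self.entries) > self.size:
            self.entries.popitem(last=False)   # drop the least recently used entry
        return q
```

The counter table_references shows how often the slower path is taken; for a program
that keeps returning to a few pages it stays small even when size is only eight.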

Paging was first used in the Atlas computer system (41, 44) in England and is
presently used in computer systems such as the IBM System/360 model 67,
the RCA Spectra 70/61, and the GE 645 (and 655).


7.7.3 Segmentation 


Another address translation implementation is shown in Fig. 7.29, where the 
address space is organized into named blocks of arbitrary size. These named blocks 
are called segments. To the programmer, a segment can be a program module or a 
data structure. The programmer references an item in the segmented address space by 
a segment name s and a word name w. A segment is stored in a contiguous area of 
main memory. The memory address at which segment s begins is called the base 
address; t indicates the segment size. 

Fig. 7.29 A segmentation implementation


As shown in Fig. 7.29, the address translation table, now called the segment
table, is stored in memory SEGTABLE. Subregister VA(S) is the address register and
register BUF is the buffer register. There are 128 virtual segment addresses, as there
are 128 words in the SEGTABLE memory. Each word of the SEGTABLE memory
contains a 14-bit main-memory address r and a 10-bit segment size t. If the word
address w falls outside size t of segment s, an overflow fault is indicated. Since some
virtual segments may not be in the main memory, these segments will have no entries
in the SEGTABLE memory; when this occurs, a missing-segment fault is indicated.
The translation from the given virtual address s-w into memory address (r + w)
is procedurally described below.


Comment, description of a segmentation implementation (7.31)
Register, VA(0-16), $address register of memory SEGTABLE
          MA(0-13), $address register of main memory
          BUF(0-23), $buffer register of memory SEGTABLE
          READ, $initiate read operation
          FAULT-M, $missing-segment indication
          FAULT-OV, $segment overflow indication
Subregister, VA(S,W)=VA(0-6,7-16),
             BUF(R,T)=BUF(0-13,14-23),
Memory, SEGTABLE(VA(S))=SEGTABLE(0-127,0-23),
Terminal, B(0-12)=VA(W)*BUF(T)'+BUF(T)'*B(1-13)+B(1-13)*VA(W),
          B(13)=0,
    VA←“s-w”, FAULT-M←0, FAULT-OV←0;
    READ←1;
    BUF←SEGTABLE(VA(S));
    IF (BUF(R)=0) THEN (FAULT-M←1, GOTO X);
    IF (B(0)=1) THEN (FAULT-OV←1, GOTO X);
    MA←BUF(R) add VA(W);
/X/ END


When the virtual address does not have a segment address in memory SEGTABLE, 
register FAULT-M is set to 1. The above terminals B are borrow terminals of the 
subtracter which subtracts virtual word address w from segment size t. When the 
difference is negative, terminal B(0) from the parallel subtracter (not shown in Fig. 
7.29) becomes 1; register FAULT-OV is set to 1 to indicate segment overflow. At 
the end, the memory address, which is the sum of r and w, is in the main memory 
address register MA. 
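
The same translation can be sketched in Python as follows. This is an illustrative
sketch, not the hardware description: the segment table is modeled as a list of
(r, t) pairs, the name translate_segmented is hypothetical, and the borrow out of the
subtracter is modeled simply as the difference t - w being negative.

```python
def translate_segmented(seg_table, s, w):
    """Return (memory_address, missing_fault, overflow_fault) for address s-w."""
    r, t = seg_table[s]        # BUF(R), BUF(T): base address and segment size
    if r == 0:                 # no entry in the table: missing-segment fault
        return None, 1, 0
    if t - w < 0:              # borrow out of the subtracter: overflow fault
        return None, 0, 1
    return r + w, 0, 0         # MA <- BUF(R) add VA(W)
```

Under this reading of the borrow test, a word address equal to the segment size does
not raise the overflow fault, because the difference t - w is then zero rather than
negative.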

As with the paging implementation, there are three possible ways to store the
segment table; they will not be discussed further. Segmentation implementation has
been used in the Burroughs B5000 series of computers (42, 45) and the Rice University
Computer (43).


7.7.4 Segmented Paging 


While address translation by segmentation meets the programmer's need, trans- 
lation by paging is more practical for implementation by hardware. It is possible to 
combine segmentation and paging into one implementation, as shown in Fig. 7.30. 
In this implementation, each segment is divided into pages. There are both segment 
and page tables. Each entry in the segment table points to one page table. There is
one segment table, but there are as many page tables as the number of entries in the 
segment table. 

As shown in Fig. 7.30, the segment table is stored in memory SEGTABLE, and 
the page tables are stored in memory PAGETABLE. Subregister VA(S) is the address
register and register BUF the buffer register of the SEGTABLE memory. Register 
AR is the address register and subregister MA(Q) the buffer register of the PAGE- 
TABLE memory. There are 32 words in the SEGTABLE memory, allowing a maxi- 
mum of 32 segments. Each segment can have as many as 16 pages. Each page is a 
contiguous area of 256 words in the main memory. There are 512 words in the PAGE- 
TABLE memory, allowing as many as 32 page-tables with each table having as many 
as 16 entries. The given virtual address consists of a 7-bit virtual segment address s, 
a 4-bit virtual page-table address wp, and an 8-bit word address ww. The 7-bit virtual 
segment address allows a maximum of 128 virtual segments. Each entry of the segment 
table consists of a 5-bit page-table-address r and a 4-bit segment size t in number of 
pages. The PAGETABLE memory word address consists of a 5-bit page-table-name 
address r and a 4-bit virtual page-table address wp. Each entry of the page table 
consists of an 8-bit main-memory page address q. The main-memory address consists 
of an 8-bit page address q and an 8-bit word address ww. 

Fig. 7.30 A segmented paging implementation


The address translation for the configuration in Fig. 7.30 works as follows. From
the given virtual segment address, an entry from the segment table is obtained; if
there is no such entry, a missing-segment fault is indicated. From the entry of the
segment table, page-table-name address r and segment size t are obtained. If the
segment size is exceeded, an overflow fault is indicated; otherwise, the page table is
located in memory PAGETABLE by page-table-name address r, and the entry of the page
table is located by page-table address wp. The main-memory page address of the given
virtual address is obtained from the entry of the page table, and the main-memory word
address from given word address ww. This translation from the given virtual address
s-wp-ww into memory address q-ww is now procedurally described below:


Comment, description of a segmented paging implementation (7.32)
Register, VA(0-18), $address register
          BUF(0-8), $buffer register of memory SEGTABLE
          AR(0-8), $address register of memory PAGETABLE
          MA(0-15), $address register of main memory
          READS, $initiate memory SEGTABLE read
          READP, $initiate memory PAGETABLE read
          FAULT-M, $missing-segment indication
          FAULT-OV, $segment overflow indication
Subregister, VA(S,WP,WW)=VA(0-6,7-10,11-18),
             BUF(R,T)=BUF(0-4,5-8),
             AR(R,WP)=AR(0-4,5-8),
             MA(Q,WW)=MA(0-7,8-15),
Memory, SEGTABLE(VA(S))=SEGTABLE(0-127,0-8),
        PAGETABLE(AR)=PAGETABLE(0-511,0-7),
Terminal, B(0-3)=VA(WP)*BUF(T)'+BUF(T)'*B(1-4)+B(1-4)*VA(WP),
          B(4)=0,
    VA←“s-wp-ww”, FAULT-M←0, FAULT-OV←0;
    READS←1;
    BUF←SEGTABLE(VA(S));
    IF (BUF(R)=0) THEN (FAULT-M←1, GOTO X);
    IF (B(0)=1) THEN (FAULT-OV←1, GOTO X);
    AR←BUF(R)-VA(WP);
    READP←1;
    MA(Q)←PAGETABLE(AR), MA(WW)←VA(WW);
/X/ END
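
The two-step lookup can be sketched in Python as follows. This is a sketch under
stated assumptions: the field widths are the 4-bit page-table address wp and the 8-bit
word address ww given in the text; the segment table is modeled as a list of (r, t)
pairs and memory PAGETABLE as a flat list of 512 entries; the name
translate_segmented_paged is hypothetical; and, as in the segmentation
implementation, an entry with r = 0 is assumed to mark a missing segment.

```python
WP_BITS = 4   # virtual page-table address wp
WW_BITS = 8   # word address ww: 256 words per page


def translate_segmented_paged(seg_table, page_table, s, wp, ww):
    """Return (memory_address, missing_fault, overflow_fault) for s-wp-ww."""
    r, t = seg_table[s]                   # page-table-name address, size in pages
    if r == 0:                            # assumed mark of a missing segment
        return None, 1, 0
    if t - wp < 0:                        # borrow out of the subtracter: overflow
        return None, 0, 1
    q = page_table[(r << WP_BITS) | wp]   # AR <- BUF(R)-VA(WP), then read
    return (q << WW_BITS) | ww, 0, 0      # MA(Q) <- q, MA(WW) <- VA(WW)
```

Cascading r with wp selects entry wp of page table r inside the single PAGETABLE
memory, so the 32 page tables of 16 entries each occupy its 512 words without any
further address arithmetic.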


As before, the segment table and the page tables may be stored in two separate
memories as shown in Fig. 7.30 or in the main memory. However, because three
references are needed for each memory access, a very fast associative memory to store
the segment table and the page tables, or at least part of these tables, is essential.


7.7.5 Scheduling 


As mentioned, paging requires three algorithms: fetch, placement, and replace- 
ment. The fetch algorithm decides when an item is to be loaded. If the loading does 
not take place until a fault occurs, it is known as demand fetch. The placement 
algorithm decides where to place an item; this often makes use of the replacement 
algorithm which decides which item is to be removed. Thus, a major consideration 
is the selection of a replacement algorithm. This choice depends mainly upon the 
behavior of programs. 

There are several commonly known replacement algorithms which all assume
a rather simple behavior of programs: random, first-in-first-out (FIFO), least
recently used (LRU), and the Atlas algorithm. The random algorithm randomly
selects a page to be replaced. The FIFO algorithm removes the page that has been in
the main memory longest. The LRU algorithm replaces the least-recently-used page.
The Atlas algorithm, implemented in the Ferranti Atlas computer, removes pages not
expected to be needed for the longest time; this was successful for programs with loops.
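
The difference between the FIFO and LRU algorithms can be seen by counting page
faults for the same sequence of page references. The following Python sketch is for
illustration only; the function names and the reference sequence used in the example
are assumptions and are not taken from any measured program.

```python
from collections import OrderedDict, deque


def count_faults_fifo(references, frames):
    """Replace the page that has been resident longest."""
    resident, order, faults = set(), deque(), 0
    for page in references:
        if page in resident:
            continue                      # a hit does not change the FIFO order
        faults += 1
        if len(resident) == frames:
            resident.discard(order.popleft())
        resident.add(page)
        order.append(page)
    return faults


def count_faults_lru(references, frames):
    """Replace the page whose last reference is furthest in the past."""
    resident, faults = OrderedDict(), 0
    for page in references:
        if page in resident:
            resident.move_to_end(page)    # a hit makes the page most recent
            continue
        faults += 1
        if len(resident) == frames:
            resident.popitem(last=False)
        resident[page] = True
    return faults
```

For the reference sequence 1, 2, 3, 1, 4, 1, 2 with three page frames, FIFO gives six
faults and LRU gives five: FIFO discards page 1 when page 4 arrives even though page 1
has just been used, whereas LRU keeps it.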

Denning (53) proposed the idea of a working set as the model for the behavior
of programs. Briefly, a working set is the minimum set of pages that must be loaded
into the main memory for a task to operate efficiently without unnecessary page
faults; thus, the working set is closely related to the task. A properly selected working
set requires a balance between the page traffic and the unused memory space. The
working set of a task obviously varies in size. At any one time, it can be defined as
those pages referenced during a selected time interval just passed. Those pages not
referenced during this time interval are candidates for replacement. This is similar
to the LRU method. However, the working set principle states that a task may be
loaded into the main memory (i.e., activated) only if the main memory is available
for its working set, and that, if the main-memory availability is exceeded at any time,
a task is selected and removed from the main memory (i.e., deactivated). As also
suggested by Denning, the working set is determined by sampling, at selected time
intervals, the page-table entries of the pages of the working sets currently in the
main memory.
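
The definition of the working set as the pages referenced during the interval just
passed can be sketched in Python as follows. This is a simplified sketch: time is
measured in references rather than in processor time, the page-table sampling
described above is not modeled, and the function names are hypothetical.

```python
def working_set(references, now, window):
    """Pages referenced in the last `window` references ending at index `now`."""
    start = max(0, now - window + 1)
    return set(references[start:now + 1])


def replacement_candidates(resident_pages, references, now, window):
    """Resident pages that were not referenced during the window."""
    return set(resident_pages) - working_set(references, now, window)
```

A longer window enlarges the working set and reduces page traffic at the cost of
memory space; a shorter one does the opposite, which is the balance mentioned above.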

By means of the working set model, Denning proposed a scheduling scheme, as shown
in the diagram in Fig. 7.31. There are four queues: the Ready queue, the Running
queue, the Blocked queue, and the Page-wait queue. A task is said to be ready if it
can run when a processor is available. The Ready queue stores all the tasks in the
ready state. A task is said to be in the running state when a processor is assigned to
the task. The Running queue stores the tasks to be processed. A task is said to be
in the blocked state when it is awaiting completion of another task (e.g., a nonpaged
I/O request). The Blocked queue stores those tasks until the tasks become unblocked;
then they are re-entered in the Ready queue. The Page-wait queue stores those tasks
that are waiting for a page to be brought into the main memory. Let i denote a task;
t_i be the processor time used by task i since it was last blocked (not due to page
fault); q_i be the time quantum assigned to task i; and r be the burst of processor
time during which task i is being executed at one time.


Fig. 7.31 Scheduling for a virtual memory

Let task i be traced through the diagram. When task i is created, an identifier is
assigned and placed in the Ready queue. Tasks are selected from the Ready queue
according to the prevailing priority rule. Once a task is selected from the Ready
queue, it is assigned a time quantum q_i which limits its time in the Running queue.
The Running queue is a cyclic queue; task i is cycled through the Running queue
repeatedly, receiving bursts r of processor time until it is blocked or exceeds its
quantum q_i. If task i is blocked, its identifier is placed in the Blocked queue where
it remains until the task is unblocked and then re-entered in the Ready queue. There
is a special task, called the checker, which is ever present in the Running queue. The
checker performs the task of the management of the main memory, samples the page
tables of each task that has received service since the last time it was run, removes
pages according to a particular replacement algorithm, and replenishes vacancies in
the Running queue by selecting tasks from the Ready queue according to the prevailing
priority rule. Notice that time t_i of task i is reset to 0 when task i enters the Ready
queue and is incremented by burst r when task i proceeds to the processor for execution.
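
The cycling of tasks through the Running queue can be sketched in Python as follows.
This is a much simplified sketch: blocking, page faults, and the checker are left out,
all tasks are given the same quantum, a task is assumed to leave the queue when its
accumulated time reaches the quantum, and the function name and event labels are
hypothetical.

```python
from collections import deque


def cycle_running_queue(required_time, quantum, burst):
    """Cycle tasks through the Running queue and report why each one leaves it.

    required_time maps a task name to the processor time it still needs.
    """
    remaining = dict(required_time)
    used = {name: 0 for name in remaining}    # t_i, reset to 0 on entry
    running = deque(remaining)                # the cyclic Running queue
    events = []
    while running:
        name = running.popleft()
        step = min(burst, remaining[name])    # one burst r of processor time
        remaining[name] -= step
        used[name] += step
        if remaining[name] == 0:
            events.append((name, "completed"))
        elif used[name] >= quantum:           # quantum q_i used up
            events.append((name, "quantum run out"))
        else:
            running.append(name)              # back of the queue for another burst
    return events
```

With two tasks that need 3 and 10 units of time, a quantum of 4, and a burst of 2, the
first task completes on its second burst, while the second uses up its quantum and
would return to the Ready queue.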

The above idea of working set was employed with some modification in the design 
of the virtual memory of the RCA Spectra 70/46 Time-sharing System (55, 56). 


References 


1. CARLSON, C. B., “The Mechanization of a Push-down Stack," Proc. of the FJCC, 
Spartan Books, Inc., 1963, pp. 243-250. 


2. HELLERMAN, H., Digital Computer System Principles. New York: McGraw-Hill Book 
Company, 1967. 


3. GSCHWIND, H. W., Design of Digital Computers. Springer-Verlag New York Inc., 1967. 
4. FLORES, I., Computer Organization. Englewood Cliffs, М. J.: Prentice-Hall, Inc., 1969. 
5. BURNETT, С. J., and COFFMAN, F. G., JR., “А Study of Interleaved Memory System,” 
Proc. of SJCC, 1970, pp. 467-474. 
Loader 
6. Сно, Y., “Application of Content-Addressed Memory for Dynamic Storage Allocation,” 
RCA Review, March, 1965, pp. 140-152. 


‚ “Direct Execution of Programs in Floating Сойе Бу Address Interpretation,” 
IEEE Trans. on E.C., June, 1965, pp. 417-422. 


8. FLonss, I., Computer Software. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1965. 


9. BARRON, D. W., Assemblers and Loaders. MacDonald and American Elsevier, 1969. 


References 325 


10. 


11. 


12. 


13. 


14. 


15. 


LANZANO, В. С., “Loader Standardization for Overlay Program,” Com. of the ACM, 
October, 1969, pp. 541-550. 


Parvo, O. R., “A Loader Algorithm for Microprogramming,” Technical Report 70-113, 
Computer Science Center, University of Maryland, 1970. 


SLADE, А. E., and МСМАНОМ, Н. O., “А Cryotron Catalog Memory System,” Proc. of 
the EJCC, December, 1956, pp. 115-120. 


McDermip, W. L., and PETERSEN, Н. E., “А Magnetic Associative Memory System,” 
IBM Journal of Research and Development 5, No. 1, January, 1961, pp. 59-62. 


FaALKorr, А. D., “Algorithms for Parallel-Search Memories,” J. of the ACM, October, 
1962, рр. 488-511. 


Rosin, В. E., “An Organization of an Associative Cryogenic Computer, “Proc. of the 
SJCC, May, 1962, pp. 203. 


Associative Memory

16. DAVIES, P. M., “A Superconductive Associative Memory,” Proc. of the SJCC, May,
1962, pp. 79.

17. CHU, Y., “A Destructive Readout Associative Memory,” IEEE Trans. on E.C., August,
1965, pp. 600-605.

18. MCKEEVER, B. T., “The Associative Memory Structure,” Proc. of the FJCC 27, part 1,
1965, pp. 371-388.

19. CAMPI, A. V., GRAY, B. H., and DUNN, B. M., “Content Addressable Memory System
Concepts,” IEEE Trans. on E.C., October, 1965, pp. 168-172.

20. SEEBER, R. R., and LINDQUIST, A. B., “Content Addressable Memories,” Proc. IFIP
Congress, 1965, pp. 479-482.

21. DUGAN, J. A., GREEN, R. S., MINKER, J., and SHINDLE, W. E., “A Study of the Utility
of Associative Memory Processors,” Proc. of the ACM, National Meeting, 1966, pp.
347-360.

22. CHU, Y., “A Programming Study of a Non-numerical Processor,” Technical Report
56-67, Computer Science Center, University of Maryland, November, 1967.

23. CRANE, B. A., “Path Finding with an Associative Memory,” IEEE Trans. on E.C.,
July, 1968, pp. 691-693.

24. FELDMAN, J. A., “An Algol-Based Associative Language,” Comm. of the ACM, August,
1969, pp. 439-449.

25. MINKER, J., “An Overview and Bibliography of Associative or Content-Addressable
Memory Systems,” 1956-1968, Auerbach Corporation, 1969.

26. TAYLOR, B. T., “Associative Memory Concepts,” unpublished notes, Computer Science
Center, University of Maryland, May, 1970.


Memory Buffering

27. ECKERT, J. P., CHU, J. C., TONIK, A. B., and SCHMITT, W. F., “Design of UNIVAC-LARC
System: I,” Proc. of the EJCC, 1959, pp. 59-65.


28. BLOOM, L., COHEN, M., and PORTER, S., “Considerations in the Design of a Computer with
High Logic-to-Memory Speed Ratio,” Proceedings of Gigacycle Computing Systems, AIEE
Special Publication S-136, 1962, pp. 53-63.

29. TAKAHASHI, S., NISHINO, H., YOSHIHIRO, K., and FUCHI, K., “System Design of the ETL
Mk-6 Computers,” Information Processing 1962, Amsterdam, The Netherlands: North
Holland Publishing Co., 1963, p. 690.

30. Ferranti Computing Systems. Atlas 2, London: Ferranti Ltd., 1963.

31. LEE, F. F., “Look-Aside Memory Implementation,” Memorandum MAC-M-99, M.I.T.,
Cambridge, Mass., August 19, 1963.

32. LEE, F. F., “Look-Aside Memory Simulation,” Memorandum MAC-M-131, M.I.T.,
Cambridge, Mass., January 2, 1964.

33. WILKES, M. V., “Slave Memories and Dynamic Storage Allocation,” IEEE Trans. on
E.C., April, 1965, pp. 270-271.

34. SCARROTT, G. G., “The Efficient Use of Multilevel Storage,” Proc. of the IFIPS Congress
1965, Spartan Books, 1965.

35. BELADY, L. A., “A Study of Replacement Algorithms for a Virtual Storage Computer,”
IBM Systems Journal 5, No. 2, 1966, pp. 78-101.

36. GIBSON, D. H., “Considerations in Block-Oriented Systems Design,” Proc. of the SJCC,
1967, pp. 75-80.

37. LIPTAY, J. S., “Structural Aspects of the System/360 Model 85, the Cache,” IBM
Systems Journal 7, No. 1, 1968, pp. 15-21.

38. SISSON, S. S., and FLYNN, J. J., “Addressing Patterns and Memory Handling Algorithms,”
Proc. of the FJCC, 1968, Thompson Book Company, pp. 957-967.

39. LEE, F. F., “Study of ‘Look-Aside’ Memory,” IEEE Trans. on E.C., November, 1969,
pp. 1062-1064.

40. MATTSON, R. L., GECSEI, J., SLUTZ, D. R., and TRAIGER, I. L., “Evaluation Techniques
for Storage Hierarchies,” IBM Systems Journal 9, No. 2, 1970.

Virtual Memory

41. FOTHERINGHAM, J., “Dynamic Storage Allocation in the Atlas Computer, Including
an Automatic Use of a Backing Store,” Com. of the ACM, October 1961, p. 435.

42. “The Descriptor—A Definition of the B5000 Information Processing System,”
Burroughs Corporation, 1961.

43. ILIFFE, J. K., and JODEIT, J. G., “A Dynamic Storage Allocation Scheme,” British
Computer Journal, October, 1962, pp. 200-209.

44. KILBURN, T., EDWARDS, D. B. G., LANIGAN, M. J., and SUMNER, F. H., “One-Level
Storage System,” IRE Trans. on E.C., April, 1962, pp. 223-235.

45. MACKENZIE, F. B., “Automated Secondary Storage Management,” Datamation 11,
1965, pp. 24-28.

46. DENNIS, J. B., “Segmentation and the Design of Multiprogrammed Computer Systems,”
J. of the ACM, October, 1965, pp. 589-602.


47. ARDEN, B. W., GALLER, B. A., O’BRIEN, T. C., and WESTERVELT, F. H., “Program and
Address Structure in a Time Sharing Environment,” J. of the ACM, January, 1966,
pp. 1-16.

48. BELADY, L. A., “A Study of Replacement Algorithms for Virtual Storage Computers,”
IBM Systems Journal 5, No. 2, 1966, pp. 78-101.

49. SHEMER, J., and SHIPPEY, B., “Statistical Analysis of Paged and Segmented Computer
Systems,” IEEE Trans. on E.C., December, 1966, pp. 855-863.

50. WILKES, M. V., Time Sharing Computer Systems. New York: American Elsevier
Publishing Co., 1968.


51. RANDELL, B., and KUEHNER, C. J., “Dynamic Storage Allocation Systems,” Com. of the
ACM, May, 1968, pp. 297-305.

52. DALEY, R., and DENNIS, J. B., “Virtual Memory, Processes, and Sharing in Multics,”
Com. of the ACM, May, 1968, pp. 306-312.

53. DENNING, P. J., “The Working Set Model for Program Behavior,” Com. of the ACM,
May, 1968, pp. 323-333.

54. BENSOUSSAN, A., CLINGEN, C. T., and DALEY, R. C., “The Multics Virtual Memory,”
Proc. 2nd ACM Symposium on Operating Systems Principles, Princeton, October, 1969,
pp. 30-42.

55. OPPENHEIMER, G., and WEIZER, N., “Resource Management for a Medium Scale Time-
Sharing Operating System,” Com. of the ACM, May, 1968, pp. 313-322.

56. WEIZER, N., and OPPENHEIMER, G., “Virtual Memory Management in a Paging
Environment,” Proc. of the SJCC, 1969, pp. 249-256.

57. DENNING, P. J., “Virtual Memory,” Technical Report No. 81, Department of Electrical
Engineering, Princeton University, January, 1970.

Problems 

7.1. A 4 by 4 electronic crossbar switch is connected at one side to four memory modules
and at the other side to four processors and channels. The switch has a data transfer
width of 36 bits and is capable of bilateral transfers. Select necessary computer elements
and describe the configuration and operation of the crossbar switch by CDL statements.

7.2. In order to use multiple-level indirect addressing described by statements (7.4) properly,
what kind of operand can be indirectly addressed?

7.3. Conceive a configuration and then describe an illustration of multiple indexing by
using the procedural version of the CDL.

7.4. Describe in detail the difference between the base addressing described in statements
(7.8) in this chapter and that described in Chapter 9.

7.5. Statements (7.18) describe the push-down stack-adjustment operation. Describe by
the CDL statements the pop-up stack-adjustment operation.


7.6. By means of statement descriptions (7.16) and (7.17), work out a numerical example
to illustrate:
(a) the match-on-larger-than operation, and
(b) the match-on-smaller-than operation.

7.7. By means of statement descriptions (7.16) and (7.18), work out a numerical example to
illustrate:
(a) the match-on-maximum operation, and
(b) the match-on-minimum operation.

7.8. In the buffering organization of Fig. 7.21, it is assumed that the first page of main
memory does not exist.
(a) What is the significance of this assumption?
(b) What changes in the buffering configuration and buffer access sequence are
required if the first page does exist and is permitted to be used? Both sequence chart
and statement description are required.

7.9. Suggest another replacement algorithm for the buffering organization of Fig. 7.21.
Make the necessary changes. Implement the new replacement algorithm in the buffer
access sequence. Both the sequence chart and the statement description are required.

7.10. What are some important differences between the two-level storage hierarchies in
Figs. 7.16 and 7.25?

7.11. If the page table in the paging implementation of a virtual memory described in
statements (7.30) is stored in an associative memory with eight registers and in main
memory instead of being stored in memory PAGETABLE, describe the paging
implementation by CDL statements.

7.12. If the segment table in the segmentation implementation of a virtual memory described
in statements (7.31) is now stored in an associative memory with eight registers and in
main memory instead of being stored in memory SEGTABLE, describe the segmentation
implementation by CDL statements.

7.13. Select a microprogram-control configuration and microprogram the dynamic loader
described by statements (7.23) and (7.24).

7.14. Select a microprogram-control configuration and microprogram the memory buffering
described by statements (7.25) and (7.27).

7.15. Microprogram the memory buffering design as described in Problem 7.7.


Chapter 8 Control Organization


A control unit generates control signals for a unit to be operated upon. Various
controls exist in a computer system: the control for the processing unit, the control
for the memory unit, the control for the channel, the controls for the I/O devices,
and the control for the operator. There are two approaches by which the control
signals are generated; these are presented in the first two sections. Controls are
also required between two system units which communicate with each other;
these are usually operated asynchronously. Asynchronous control organization is
illustrated in the third section. The organizations for performing the functions of
sequencing, addressing, and priority interrupt in controlling the processing unit are
described in the last section.


8.1 Sequential-logic Control Organization 


A control unit generates signals to control single-step micro-operations, a 
sequence of micro-operations, or multiple sequences of micro-operations. Thus, the 
control signals required to be generated by a control unit are: 


1. single-step control signals, 
2. a sequence of control signals, or 


3. multiple sequences of control signals. 


Such a control unit can be realized by the sequential logic or by a microprogram. 
Sequential logic control refers to the use of logic circuitry to generate control signals, 
while the microprogram control makes use of a program stored in a control memory. 
This section describes the control organization using sequential logic, and the next 
section describes that using a microprogram. 


8.1.1 Single-step Control 


To generate single-step control signals, one may simply use a register. An example 
of such a control organization is described below. 


Comment, generation of single-step control signals by a register        (8.1)
Register, F(0-7),
Clock,    P,
Terminal, K(0-7)=F(0-7)*P,


The eight terminal K's from register F give eight single-step control signals. 
The terminals are active for those bits of register F which contain a 1. Therefore, 
up to eight control signals can exist at the same time. 

Alternatively, one may use the combination of a register and a decoder. An
example of such a control organization is described below.

Comment, generation of single-step control signals with a decoder       (8.2)
Register, F(0-2),
Decoder,  K(0-7)=F,
Clock,    P,
Terminal, M(0-7)=K(0-7)*P,

The eight terminal M's from the decoder give eight single-step control signals. In
this case, only one of the eight terminals can be active at one time.
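
The difference between statements (8.1) and (8.2) can be seen in a small Python sketch.
It is only an illustration: the function names `register_control` and `decoder_control`
and the list-of-bits representation are assumptions made here, not part of the CDL.

```python
def register_control(f_bits, p):
    """Statements (8.1): K(i) = F(i)*P for each bit of register F."""
    return [bit & p for bit in f_bits]


def decoder_control(f_value, p, width=3):
    """Statements (8.2): decode register F into 2**width lines K, then M(i) = K(i)*P."""
    k = [1 if i == f_value else 0 for i in range(2 ** width)]
    return [line & p for line in k]


# Register form: any subset of the eight signals may be active together.
print(register_control([1, 0, 1, 0, 0, 0, 0, 1], p=1))
# Decoder form: exactly one of the eight signals is active while P is 1.
print(decoder_control(5, p=1))
```

The register form spends one bit per signal and allows simultaneous signals; the decoder
form needs only three bits for eight signals but allows one signal at a time.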


8.1.2 Single Sequence Control 


To generate a sequence of control signals, one may use a circularly-shifted regis- 
ter. An example of such a control organization is described as follows: 


Comment, generation of a sequence of control signals by a shift register   (8.3)
Register, F(0-7),
Clock,    P(1-2),
Terminal, K(0-7)=F(0-7)*P(1),
/P(2)/    F←cir F,


In the above, it is assumed that only one bit of register F initially contains a 1.
As the contents of register F are shifted one bit to the right every clock phase P(2),
a sequence of control signals K is generated from the successive bits of register
F (e.g., from bit F(0) through bit F(7) and back to bit F(0)). If a branch in the sequence
of control signals is required, the circular shift microstatement above should be replaced
by those microstatements that will describe the desired sequencing and branching.

Alternatively, a sequence of control signals can be generated by using the combi- 
nation of a counter and a decoder. An example of such a control organization is 
described below: 


Comment, generation of a sequence of control signals by a counter       (8.4)
Register, C(0-2),
Clock,    P(1-2),
Decoder,  K(0-7)=C,
Terminal, M(0-7)=K(0-7)*P(1),
/P(2)/    C←countup C,

The above gives a sequence of eight control signals from terminals M(0)-M(7) and
returns to M(0) with the assumption that register C is initially 0. If a branch in the
sequence of control signals is required, the above counting micro-operation should
be changed into those that will describe the desired sequencing and branching; in
this case, the counter becomes a special sequence counter.
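
The two single-sequence organizations of statements (8.3) and (8.4) can be traced step
by step. The following Python sketch is only illustrative: the function names and the
list representation of register F are assumptions, and each loop iteration stands for
one clock cycle (signals sampled at P(1), register updated at P(2)).

```python
def ring_sequence(f_bits, cycles):
    """Statements (8.3): return the index of the active signal K in each cycle."""
    f = list(f_bits)
    active = []
    for _ in range(cycles):
        active.append(f.index(1))   # P(1): K(i) = F(i), exactly one bit holds a 1
        f = [f[-1]] + f[:-1]        # P(2): F <- cir F (circular shift right by one bit)
    return active


def counter_sequence(c, cycles, width=3):
    """Statements (8.4): return the index of the active signal M in each cycle."""
    active = []
    for _ in range(cycles):
        active.append(c)            # P(1): M(i) = K(i), with K decoded from counter C
        c = (c + 1) % (2 ** width)  # P(2): C <- countup C
    return active


print(ring_sequence([1, 0, 0, 0, 0, 0, 0, 0], cycles=9))
print(counter_sequence(0, cycles=9))
```

Both calls step through signals 0 to 7 and then return to 0; the ring register needs
eight bits for this, the counter three bits plus a decoder.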


8.1.3 Multiple-sequence Control 


To generate multiple sequences of control signals, one may use the combination
of a register and a multiple-phase clock. An example of such an organization is
described below.


Comment, generation of multiple sequences of control signals            (8.5)
Register, F(1-4),
Clock,    P(1-8),
Terminal, K1(1-8)=F(1)*P(1-8),
          K2(1-8)=F(2)*P(1-8),
          K3(1-8)=F(3)*P(1-8),
          K4(1-8)=F(4)*P(1-8),

The above organization generates four sequences K1, K2, K3, and K4 with eight
control signals in each sequence. Since the clock generates a fixed sequence, these
sequences of control signals cannot be branched. However, up to four of these
sequences can be simultaneously generated.

An alternative way of generating multiple sequences of control signals is to use
the combination of a counter and a multiple-phase clock. An example of such an
organization is described below.


Comment, generation of multiple sequences of control signals with a counter   (8.6)
Register, C(0-2), F(1-4),
Decoder,  K(0-7)=C,
Clock,    P(1-2),
Terminal, M1(0-7)=F(1)*K(0-7)*P(1),
          M2(0-7)=F(2)*K(0-7)*P(1),
          M3(0-7)=F(3)*K(0-7)*P(1),
          M4(0-7)=F(4)*K(0-7)*P(1),
/P(2)/    C←countup C,


The organization above generates four sequences M1, M2, M3, and M4 with eight
control signals in each sequence. If a branch in one or more of these sequences is
required, the above counting microstatement should be replaced by those
microstatements which describe the desired sequencing and branching of the particular
sequence or sequences.
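
As a rough illustration of statements (8.5), the Python sketch below computes the
terminals Ki(j) = F(i)*P(j) for one active clock phase. The function name
`multiple_sequences` and the dictionary it returns are assumptions chosen for this
example only.

```python
def multiple_sequences(f_enable, phase, n_phases=8):
    """Statements (8.5): Ki(j) = F(i)*P(j).

    f_enable : bits F(1), F(2), ..., one enable bit per sequence
    phase    : number of the clock phase P(j) that is currently active (1-based)
    """
    clock = [1 if j == phase else 0 for j in range(1, n_phases + 1)]
    return {
        f"K{i}": [enable & p for p in clock]
        for i, enable in enumerate(f_enable, start=1)
    }


# Sequences K1 and K3 are enabled; clock phase P(3) is active.
signals = multiple_sequences([1, 0, 1, 0], phase=3)
for name, bits in signals.items():
    print(name, bits)
```

The organization of statements (8.6) has the same gating structure, with the decoded
counter outputs K(0-7) taking the place of the clock phases.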


8.1.4 Timing 


The main memory operates on a memory cycle. For recent large-scale computers, 
the memory cycle time ranges from 0.5 to 2 microseconds. The access time is the 
time when the word becomes available at the memory buffer register (see Fig. 8.1). 
It ranges from 4 to 4 of the memory cycle time. The CPU also operates on a cycle 
time. The CPU cycle time (or clock period) for recent computers ranges from 50 or 
less to 500 nanoseconds. Each CPU cycle time consists of a number of clock phases. 


[Figure 8.1: timing diagram. Panel (a) marks, over one memory cycle time, the address
transfer, the read word arriving in the buffer register (access time), the point at which
the write word must already be in the buffer register, and the completion of the write
operation. Panel (b) shows the clock phases starting at P(0); panel (c) shows the same
phases starting at P(6).]

Fig. 8.1 Main memory and CPU timing: (a) memory cycle; (b)
CPU cycle or clock cycle P(0-11); (c) offset clock
cycle


For example, the CPU cycle time of the IBM System/360 model 40 is 625 nanoseconds, 
and the cycle has eight clock phases. The CPU cycle time of the IBM System/360 
model 85 is 80 nanoseconds, and the cycle has two clock phases. The CPU cycle 
time of the RCA Spectra 70/45 is 480 nanoseconds, and the cycle has four clock phases. 

The CPU cycle time is sometimes chosen to be the same as the memory cycle
time, and the number of clock phases of the CPU cycle is chosen to match the
speed of the logic circuitry. An example where the memory cycle time is equal to the
CPU cycle time is shown in Fig. 8.1. The timing of the following events in a memory
cycle is important:

(a) The time when the address is transferred to the memory address register,

(b) The time when a memory operation is initiated (this may often be the time just
after the address is transferred),

(c) The time when the word is read into the buffer register (i.e., the access time) in a read
operation,

(d) The time when the word to be written must already be in the buffer register in a write
operation, and

(e) The time when the word has just been written into the memory in a write operation.

The above timing is indicated in Fig. 8.1(a). The CPU cycle has 12 clock phases as
shown in Fig. 8.1(b); thus, the memory cycle is divided into 12 periods. (The 12 clock
phases in Fig. 8.1(b) and (c) should be drawn in 12 coordinates.) Note that clock P(i)
occurs at the beginning of that clock period, but the micro-operations which clock
P(i) activates will complete before the end of that clock period. For the memory and
CPU timing shown in Fig. 8.1(a) and (b), the memory operation may be described as
follows:
Comment, description of a main memory                                   (8.7)
Register, C(0-14),                $address register
          D(0-14),                $program register
          R(0-23),                $buffer register
Memory,   M(C)=M(0-32767,0-23),
Clock,    P(0-11),                $CPU clock


Comment, the CPU cycle time and memory cycle time are the same         (8.8)
Comment, memory read operation
/P(0)/    C←D,                    $address transfer
/P(5)/    R←M(C),                 $word arrives at buffer register
Comment, memory write operation
/P(0)/    C←D,                    $address transfer
/P(10)/   M(C)←R,                 $word is stored

The clock cycle is sometimes made offset with respect to the memory cycle. An
offset clock cycle is shown in Fig. 8.1(c). The memory operation with the offset clock
cycle is described below:


Comment, memory read operation                                          (8.9)
/P(6)/    C←D,                    $address transfer
/P(11)/   R←M(C),                 $word arrives at buffer register


Comment, memory write operation                                         (8.10)
/P(6)/    C←D,                    $address transfer
/P(0)/
/P(4)/    M(C)←R,                 $word is stored


The offset above makes the fetched word ready in the memory buffer register at 
the beginning of clock period P(0). 
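
The relation between the two clock labelings can be checked with a few lines of Python.
This is a sketch under stated assumptions: the offset is taken to be six phases, as
suggested by Fig. 8.1(c), the event phases are those of the non-offset description, and
the names `EVENTS` and `offset_phase` are invented for the example.

```python
PHASES = 12  # clock phases P(0) through P(11) in one memory cycle

# Clock phase of each memory event when the clock is not offset.
EVENTS = {
    "address transfer": 0,
    "read word arrives at buffer register": 5,
    "write word is stored": 10,
}


def offset_phase(phase, offset, phases=PHASES):
    """Return the phase label of the same instant when the clock is offset."""
    return (phase + offset) % phases


for name, phase in EVENTS.items():
    print(f"{name}: P({phase}) -> P({offset_phase(phase, 6)})")
```

With this offset the read word arrives during P(11), the last period before P(0), which
is why it is available at the beginning of clock period P(0).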


8.2 Microprogram Control Organization 


Instead of using sequential logic, a control unit may also be realized by means 
of a microprogram stored in a control memory. The concepts of microprogram control 
and the organization of microprogram computers were presented in Chapter 3. This 
section first reviews the microprogram control organization and then describes several 
additional aspects of microprogram control. 


8.2.1 A Microprogram Control Unit


The configuration of a microprogram control unit or MCU is shown in Fig.
8.2. It consists of control memory CM, address register H, buffer register F, decoders,
a counting network, and a branch logic network. Each word of the control memory
is called a control word or a micro-instruction. The buffer register also serves as the
control word register. This configuration is now partially described below.


Comment, organization of a microprogram control unit                    (8.11)
Register,    H(0-7),                  $address register
             F(1-36),                 $control word register
             A(S, 1-23),              $accumulator
             C(0-3),                  $counter
Subregister, F(ADDR)=F(0-7),          $address field
Memory,      CM(H)=CM(0-255, 1-36),   $control memory
Clock,       P(0-1),
/P(0)/       IF (C=0) THEN (H←countup H) ELSE (H←F(ADDR)),
/P(1)/       F←CM(H),


A control word in the control word register can generate a multitude of single-step
control signals. The designation of these control signals constitutes the format
of the control word. A sequence of micro-instructions in the control memory forms
a microprogram; it generates a sequence of control signals with a multitude of
single-step control signals at each step of the sequence. A microprogram may also be
written to generate multiple sequences of control signals.

The micro-instructions of a microprogram are sequenced by controlling the
address of the next control word. As shown in the above execution statements, this
address is obtained (a) by incrementing address register H, (b) by transferring
the address field of the control word to address register H, or (c) by using a branch
condition-code field in the control word and the branch logic network, as indicated
in Fig. 8.2.
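
The address selection at clock phase P(0) in statements (8.11) can be sketched in Python
as follows. The sketch is illustrative only: the names `next_address` and `trace` are
assumptions, the counter value C and the address field F(ADDR) are supplied by hand for
each control cycle, and wrap-around of the 8-bit register H on counting up is assumed.

```python
CM_SIZE = 256  # control memory CM(0-255) in statements (8.11)


def next_address(h, c, f_addr):
    """Model the P(0) statement: IF (C=0) THEN (H <- countup H) ELSE (H <- F(ADDR))."""
    if c == 0:
        return (h + 1) % CM_SIZE   # countup on the 8-bit register H (wrap-around assumed)
    return f_addr % CM_SIZE        # take the address field of the control word


def trace(start, steps):
    """Follow H for a list of (C, F(ADDR)) pairs, one pair per control cycle."""
    h = start
    visited = [h]
    for c, f_addr in steps:
        h = next_address(h, c, f_addr)
        visited.append(h)
    return visited


# Two sequential steps, one transfer to address 40, then one more sequential step.
print(trace(0, [(0, 0), (0, 0), (1, 40), (0, 0)]))
```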


[Figure 8.2: block diagram. Labeled elements: address register, main memory, control
memory CM, next address, branch logic, condition signals, and control signals.]

Fig. 8.2 Configuration of a microprogram control unit


The use of a condition code for branching requires that the value (0 or 1) of the
low-order bit of the next-control-word address be made to depend on the condition
code; this offers a two-way branching without the need of using two address fields
in the control word. If the condition code is 3-bit and if three of the eight condition
codes are chosen to set the low-order bit to 0 and five are chosen to set it to 1, this
allows the address field to specify the address of the next control word in any two
neighboring control memory locations; there is no need of using the address
incrementing micro-operation. To be specific, the above two-way branching is described
by the following statements:


Comment, an example of two-way branch using 3-bit branch condition code   (8.12)
Subregister, F(CC)=F(8-10),           $condition code field
IF ((F(CC)=0)+(F(CC)=3)+(F(CC)=5))
   THEN (H(0-6)←F(0-6), H(7)←0),
IF ((F(CC)=1)+(F(CC)=4)+(F(CC)=6)+(F(CC)=7))
   THEN (H(0-6)←F(0-6), H(7)←1),
IF ((F(CC)=2)*(A(S)=1)) THEN (H(0-6)←F(0-6), H(7)←1),


The use of the condition code for branching is somewhat restrictive because the two 
branched micro-instructions have to be at two neighboring locations. 
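
The two-way branch of statements (8.12) can be mimicked in Python as a hedged sketch.
The function name `branch` is an assumption, bit 7 is taken as the low-order bit of H
and of the address field, and the case of condition code 2 with A(S)=0, for which
(8.12) gives no statement, is assumed here to leave H unchanged.

```python
def branch(h, f_addr, cc, a_sign):
    """Return the new 8-bit address register H after the statements in (8.12).

    h      : current contents of H
    f_addr : 8-bit address field F(0-7) of the control word
    cc     : 3-bit condition code F(CC)
    a_sign : sign bit A(S) of the accumulator
    """
    high = f_addr & 0b11111110       # H(0-6) <- F(0-6)
    if cc in (0, 3, 5):
        return high                  # H(7) <- 0
    if cc in (1, 4, 6, 7):
        return high | 1              # H(7) <- 1
    if cc == 2 and a_sign == 1:
        return high | 1              # branch taken only when A(S) is 1
    return h                         # assumption: no statement applies, H unchanged


# The same address field reaches either of two neighboring locations.
print(format(branch(0, 0b10100110, 0, 0), "08b"))
print(format(branch(0, 0b10100110, 1, 0), "08b"))
```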


8.2.2 Timing 


Most control memories of today's microprogrammed computers are read-only 
memories, or ROM. They often have a capacity of several thousand words and a 
word length of 50 to 120 bits. In a ROM, the time when the next-control-word address 
is transferred to the address register and the time when the word is read into the 
control word register (i.e., the access time) are important. These are illustrated in the 
timing diagram in Fig. 8.3(a). 

The CPU cycle time is sometimes chosen to be the same as the ROM cycle time. 
If the ROM cycle time is chosen to be longer, the CPU operates at the ROM speed. 
If the ROM cycle time is chosen to be shorter, the CPU operates at the CPU rate. 
In either case, the hardware may not be well utilized. 

During each ROM cycle, a micro-instruction is fetched from the ROM and the
CPU micro-operations are then executed. Thus, the above microprogram control
unit calls for a two-phase clock. As shown in Fig. 8.3, during the first phase, the
address of the next control word is transferred to address register H, and, at the same
time, the CPU micro-operations activated by the control bits in control-word register
F are carried out. During the second phase, the next control word is read out of control
memory CM into control word register F. These two phases form the control cycle
of the microprogram control unit; thus, the CPU micro-operations in each control
cycle are controlled by one micro-instruction.

[Figure 8.3: timing diagram. Panel (a) marks the address transfer and the arrival of the
control word at the control word register within one ROM cycle; panel (b) shows the
clock phases on the same time axis.]

Fig. 8.3 ROM and clock timing: (a) ROM cycle; (b) clock
P(0-1)

If the MCU makes use of condition-code branching, a three-phase clock is
required. During the first phase, the address of the next control word is transferred.
During the second phase, the control word is read into the control word register.
During the third phase, the CPU micro-operations are performed. Since the condition
specified by the condition code may not be known until the CPU micro-operations
are executed, the address transfer and the execution of the CPU micro-operations
cannot occur during the same clock phase. Two separate clock phases are required. As
an example, the sequencing of an MCU with condition-code branching is described
as follows.
Comment, an example of sequencing with condition-code branching        (8.13)
Clock,  P(0-2),
/P(0)/  IF ((F(CC)=2)*(A(S)=1)) THEN (H(0-6)←F(0-6), H(7)←1),
/P(1)/  F←CM(H),
/P(2)/  Execute CPU micro-operations


If the CPU micro-operations require multi-step control signals, then a clock
with more than three phases is required. In short, the choice of the number of clock
phases depends on the control signals required by the CPU and the MCU
micro-operations.


8.2.3 Control Hierarchy 


The characteristics of the control memory in the above MCU play an important
role. At the current state of technology, the speed of the control memory limits the
CPU speed, and the capacity limits the microprogram size. A similar situation once
existed in the main memory, but a storage hierarchy of two or more levels is now
commonly used. Similar to the storage hierarchy, a control hierarchy was conceived.

Figure 8.4 shows a two-level control hierarchy which makes use of two
previously-described MCU's. As shown in Fig. 8.4, an address field of the control word of
the second-level MCU specifies the address of a first-level microprogram (which may
be called a microroutine). It is reasonable to assume that the first-level control memory
M1 is faster in speed but smaller in capacity than the second-level control memory
M2. Those most frequently used microprograms in micro-instructions are stored in
the first-level control memory M1, while “macroprograms” in “macro-instructions”
are stored in the second-level control memory M2. Communication between the two
levels should be provided.

[Figure 8.4: block diagram. Labeled elements: second-level unit with address input,
control memory CM2, next address, and microroutine address; first-level microprogram
control unit with branch logic, condition signals, decoders, and control signals.]

Fig. 8.4 Configuration of a two-level microprogram control
hierarchy

Instead of one address field for specifying a microroutine address, there can be
two or more address fields so that two or more MCU's are used. Furthermore,
instead of two levels, three or more levels can be used.

An alternative way to use the control hierarchy is to make the computer
user-microprogrammable. In this case, elementary microprograms are stored in the
first-level control memory. User's programs are stored in the second-level control memory,
which is a read-write control memory. If the user is a system programmer, then system
control microprograms are stored in the second-level control memory. Since the
second-level control memory is faster than the main memory, a microprogrammable
computer would be more flexible and more productive than a conventional
stored-program computer.


8.2.4 Control Word Format 


The task of formulating the control word format is not a simple one. On the one 
hand, the control word has to accomplish the following functions: 


1. to group and designate the control bits of the control word, 
2. to sequence the control words of the microprogram, and 


3. to retain some flexibility for re-microprogramming. 


On the other hand, these functions should be performed with minimum bits in the 
control word, minimum words in the control memory, and minimum time in executing 
the microprogram. This section describes some techniques that are useful in formulat- 
ing the control word format. 

There are several techniques to designate and arrange control bits in the control
word: (a) the direct control, (b) the bit grouping, (c) the multiple formats, and (d)
vertical microprogramming. The direct control causes each control bit to generate
one signal to control one micro-operation. There can be as many as several hundred
micro-operations in a large-scale computer; this requires a long control word. The
number of control bits can be reduced by means of grouping in any or all of the
following three ways. One way is to group those micro-operations that always occur
at the same time and, in this way, only one control bit controls several
micro-operations. Another way is to group those micro-operations of which only one
occurs at one time; this grouping is achieved by encoding these micro-operations
with a multiple-bit field. The third way to reduce the number of bits is to group
several micro-operations into a sequence and to control this sequence by one control
bit in conjunction with a multiple-phase clock. These techniques have been discussed
in Chapter 3. The number of bits can be further reduced by using multiple
control-word formats. This is achieved by having one field specify the way each control word
is interpreted. Since the control word is usually divided into a number of fields, the
idea of multiple control-word formats may also be applied to multiple field formats.
In this case, a field is set aside to specify how the bits in some other fields of the
control word are interpreted. The use of multiple formats can save a significant
number of control bits at the expense of more complex decoding circuitry. The
reduction of control bits by any of the above-described techniques is achieved at the
expense of losing flexibility for remicroprogramming.
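
The saving obtained by encoding mutually exclusive micro-operations can be estimated
with a short Python sketch. Everything specific in it is an assumption made for
illustration: the function names, the group sizes, and the convention that each encoded
field reserves one extra code for "no operation."

```python
def direct_control_bits(group_sizes):
    """Direct control: one control bit per micro-operation."""
    return sum(group_sizes)


def encoded_control_bits(group_sizes):
    """Bit grouping: one encoded field per group of mutually exclusive micro-operations.

    A group of n micro-operations needs n + 1 codes (one is reserved for "no
    operation"), and n.bit_length() equals ceil(log2(n + 1)) for n >= 1.
    """
    return sum(n.bit_length() for n in group_sizes)


groups = [15, 15, 30, 60]  # hypothetical groups of mutually exclusive micro-operations
print(direct_control_bits(groups), "bits with direct control")
print(encoded_control_bits(groups), "bits with encoded fields")
```

The encoded word is much shorter, but a decoder is needed for each field and only one
micro-operation per group can be specified in a control word.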

The above manner of using the control bits to generate the control signals 
directly is called the horizontal microprogramming. It often suffers from low utilization 
of the control bits, since a small number of combinations of the control bits are 
meaningful and useful. Another manner which makes use of the opcode and address 
fields of the conventional instruction format is called the vertical microprogramming. 
In the vertical microprogramming, it is more desirable to have an arithmetic and/or a 
logical unit whose operations can be readily controlled by the op-codes in the op-code 
field. The operands for this unit are usually stored in some registers; thus, three 
additional fields are provided to designate the two operand registers and the register 
to store the result. In order that an operand can be a given constant, a special field 
called the emit field is provided to store a constant. The constant in the emit field 
will be transferred by hardware to a specified destination. 

The second important function of the control word is sequencing, branching, 
and looping of the control words in the microprogram. There are several ways to 
accomplish this function: (a) incrementing the current address, (b) using the next 
control-word address, (c) using the branch condition code, (d) using the branch 
source code, and (e) using a counter or a status register. The sequencing by increment- 
ing the current address in the address register is the simplest way to obtain the next 
control-word address, and it needs no control bit. However, it is preferable to specify 
the next-control-word address in an address field of the control word for sequencing, 
because the microprogram can be more effectively prepared if successive micro- 
instructions can be in any locations of the control memory. In addition, it offers an 
economical way for branching in bits as follows. For a two-way branching, two next 
addresses are normally required. However, only one next control-word address is 
needed if the low-order bit of this next control-word address can be modified accord- 
ing to some branch condition code, as illustrated previously in statements (8.12). 
Instead of allowing one low-order bit for two-way branching, it is possible to use two 
low-order bits for four-way branching, three low-order bits for eight-way branching, 
and so forth. Furthermore, instead of the low-order bits, it is possible to use the 
high-order bits for multi-way branching. To use condition code field for multi-way 
branching costs only a few bits, but the branching is restrictive because the branched 
control words are limited to certain locations of the control memory. The above 
branching can be made more effective if the branch source code field is additionally 
used. The branch source code enables one to have the choice of having the next 
control-word taken either from the address in the next control-word address field or 
from some other source such as the address in another register. Multi-way branching 
can thus be obtained by this technique. Alternatively, branching can be achieved by 
using a status register which stores the data condition when it is available and is then 
used later for branching. This technique avoids duplication of control words before
the branch is actually necessary. It also saves control bits, because sequencing
information is held external to the control memory. For looping, a counter is required.
The breaking out of a microprogrammed loop can be achieved by causing a branch 
when the counter reaches a particular value. The loop, wherever used, saves the num- 
ber of control words. It can also save control bits, because sequencing information 
is held externally and not by the control word. 
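
The condition-code technique described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `next_address`, the 8-bit example address, and the way the condition bits are supplied are all hypothetical and are not taken from the text.

```python
def next_address(address_field, condition_bits, width=1):
    """Replace the `width` low-order bits of the next-address field
    with branch-condition bits (hypothetical helper, not from the text)."""
    mask = (1 << width) - 1
    return (address_field & ~mask) | (condition_bits & mask)


# Two-way branching: one address field, one condition bit.
two_way = [next_address(0b10110100, c) for c in range(2)]

# Four-way branching: the same address field, two condition bits.
four_way = [next_address(0b10110100, c, width=2) for c in range(4)]
```

The sketch also shows the restriction noted above: the four branch targets necessarily occupy four adjacent control-memory locations.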

The use of the above techniques in formulating the control word format is 
illustrated in the following descriptions of a microprogrammed CPU and a micro- 
programmed I/O control unit. 


8.2.5 A Microprogrammed CPU 


The configuration of the microprogrammed CPU (12) is shown in Fig. 8.5. The 
CPU consists of accumulator A, buffer register B, instruction register IC, status 
register STAT, a parallel adder, a shifter, and a zero tester. In addition, there are 
main bus MB, bus L for the left inputs of the parallel adder, and bus R for the right 
inputs of the parallel adder. Let S be the output terminals of the parallel adder, Q be 
the output terminals of the shifter, IN be the input terminals, and OUT be the output 
terminals. The parallel adder can add two operands with input carry C(31) being 0 
or 1. This configuration is described by the following statements. 


Comment, a microprogrammed CPU organization

Comment, CPU configuration                                   (8.14)
Register,  A(0-15),      $accumulator
           B(0-15),      $buffer register
           IC(0-15),     $instruction counter
           STAT(0-15),   $control register
Bus,       MB(0-15),     $main bus
           L(0-15),      $ALU left-input bus
           R(0-15),      $ALU right-input bus
Comment, input and output terminals
Terminal,  IN(0-15),
           OUT(0-15),
Comment, description of the zero tester                      (8.15)
Terminal,  ZT=MB(0)'*MB(1)'*MB(2)'* ... *MB(15)',
Comment, description of the parallel adder                   (8.16)
           S(0-31)=L(0-31)⊕R(0-31)⊕C(0-31),
           C(0-30)=L(1-31)*R(1-31)+R(1-31)*C(1-31)+C(1-31)*L(1-31),
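
The sum and carry equations of statement (8.16) can be checked with a small bit-level sketch. It assumes, as the indexing in the text suggests, that index 0 is the high-order bit and that the input carry enters at the low-order position. The function names and the 16-bit example width are hypothetical choices made only for this illustration.

```python
def to_bits(value, n):
    """Return an n-bit list with index 0 as the high-order bit."""
    return [(value >> (n - 1 - i)) & 1 for i in range(n)]


def from_bits(bits):
    """Inverse of to_bits."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


def parallel_adder(l_bits, r_bits, carry_in):
    """Sum bit = L xor R xor C; the carry into the next higher position
    is the majority of L, R, and C, as in the carry equation of (8.16)."""
    n = len(l_bits)
    c = carry_in                      # carry into the low-order position
    s = [0] * n
    for i in range(n - 1, -1, -1):    # low-order bit first
        s[i] = l_bits[i] ^ r_bits[i] ^ c
        c = (l_bits[i] & r_bits[i]) | (r_bits[i] & c) | (c & l_bits[i])
    return s, c


# 5 - 3 formed as 5 + (complement of 3) + input carry of 1.
diff_bits, carry_out = parallel_adder(to_bits(5, 16), to_bits(~3 & 0xFFFF, 16), 1)
difference = from_bits(diff_bits)
```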


[Figure: block diagram with the control part (clock, next address), register B, buses L and R, the parallel adder with carry C(31), the zero tester with output ZT, shifter output Q, terminals IN and OUT, and main bus MB]

Fig. 8.5 A CPU configuration


There are a large number of possible data paths in the configuration in Fig. 8.5.
These paths are described by connect micro-operations such as connecting shifter
output Q to main bus MB (MB=Q), or connecting terminals S to Q with a 1-bit
left shift (Q=shl S), or micro-operations such as transferring the signals on main bus
MB to register A (A←MB). (The controlled connections of terminals are called
gating by certain hardware groups to mean that the data flow can be gated to
different paths.) These micro-operations are grouped and coded in the assigned fields with
one code of the field usually designating one such micro-operation. A 2-bit field 
CA is chosen to designate connections to bus L as shown in Table 8.1. A 3-bit 
field CB is chosen to designate connections to bus R as shown in Table 8.2. A 2-bit 
field COP is chosen to designate the micro-operations of the parallel adder as shown 
in Table 8.3. A 2-bit field CSH is chosen to designate the connect operations of the 
shifter as shown in Table 8.4. A 3-bit field CMB is chosen to designate the connec- 


TABLE 8.1 Field CA

CODE    CONNECT MICRO-OPERATION
00      L=0
01      L(0-7, 8-15)=0-F(EM),
10      L(0-7, 8-15)=F(EM)-0,
11      L=A


TABLE 8.2 Field CB

CODE    CONNECT MICRO-OPERATION
000     R=0,
001     R=B,
010     R=B',
011     R=IC,
100     R=STAT,
101     Not used
110     Not used
111     Not used


TABLE 8.3 Field COP

CODE    OPERATOR
00      add with C(31)=0
01      add with C(31)=1
10      Not used
11      Not used


TABLE 8.4 Field CSH

CODE    CONNECT MICRO-OPERATION
00      MB=Q, Q=S,
01      MB=Q, Q=shr S,
10      MB=Q, Q=shl S,
11      MB=IN,


tions to main bus MB as shown in Table 8.5. The control memory has an 8-bit address. 
Of them, six high-order bits H(0-5) are provided by the next-address field CNA,
while the two low-order bits H(6, 7) are determined by fields CAB and CBB as shown 
in Tables 8.6 and 8.7. Storing the zero test result and the sign bit in register STAT is 
designated by field CST as shown in Table 8.8. Emit field CEM is provided for storing 


TABLE 8.5 Field CMB

CODE    MICRO-OPERATION
000     No operation
001     A←MB,
010     B←MB,
011     IC←MB,
100     STAT←MB,
101     OUT=MB,
110     Not used
111     Not used


TABLE 8.6 Field CAB

CODE    MICRO-OPERATION
00      H(6)←0,
01      H(6)←1,
10      H(6)←STAT(0),
11      H(6)←STAT(Q),


TABLE 8.7 Field CBB

CODE    MICRO-OPERATION
00      H(7)←0,
01      H(7)←1,
10      H(7)←STAT(1),
11      H(7)←B(0),


TABLE 8.8 Field CST

CODE    MICRO-OPERATIONS
00      No operation,
01      IF (ZT=0) THEN (STAT(0)←0) ELSE (STAT(0)←1),
10      STAT(1)←MB(0),
11      IF (ZT=0) THEN (STAT(0)←0) ELSE (STAT(0)←1),
        STAT(1)←MB(0)


| CA | CB | COP | CSH | CMB | CAB | CBB | CST | CNA | CEM |
   2    3     2     2     3     2     2     2     6     8

Fig. 8.6 Control word fields of the microprogrammed CPU


an 8-bit constant. These fields are shown in Fig. 8.6 and the format is summarized in 
Table 8.9. 


TABLE 8.9 Control Word Format of the Microprogrammed CPU

FIELD   BITS     DESIGNATION
CA      0-1      Code for registers connected to bus L
CB      2-4      Code for registers connected to bus R
COP     5-6      Op-code for the ALU
CSH     7-8      Op-code for the shifter
CMB     9-11     Code for registers connected to bus MB
CAB     12-13    Branch code for bit H(6)
CBB     14-15    Branch code for bit H(7)
CST     16-17    Code for STAT register
CNA     18-23    Next-control-word address
CEM     24-31    Emit field
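
As an illustration of the format in Table 8.9, the following sketch packs field codes into a 32-bit control word and unpacks them again. It assumes that bit 0 is the leftmost (high-order) bit of the word, which matches the left-to-right field order of Fig. 8.6 but is not stated explicitly. The helper names `pack` and `unpack` and the example CNA value are hypothetical.

```python
WORD_BITS = 32

# (field name, first bit, last bit), following Table 8.9.
FIELDS = [
    ("CA", 0, 1), ("CB", 2, 4), ("COP", 5, 6), ("CSH", 7, 8),
    ("CMB", 9, 11), ("CAB", 12, 13), ("CBB", 14, 15), ("CST", 16, 17),
    ("CNA", 18, 23), ("CEM", 24, 31),
]


def pack(values):
    """Build a control word from a dict of field codes (missing fields are 0)."""
    word = 0
    for name, first, last in FIELDS:
        width = last - first + 1
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= value << (WORD_BITS - 1 - last)   # bit 0 assumed leftmost
    return word


def unpack(word):
    """Split a control word into a dict of field codes."""
    result = {}
    for name, first, last in FIELDS:
        width = last - first + 1
        result[name] = (word >> (WORD_BITS - 1 - last)) & ((1 << width) - 1)
    return result


# Field codes for a word that connects A to bus L and B to bus R, adds with
# C(31)=1, passes the sum to the main bus, and loads A; CNA is arbitrary here.
example = {"CA": 3, "CB": 1, "COP": 1, "CSH": 0, "CMB": 1,
           "CAB": 2, "CBB": 2, "CST": 3, "CNA": 0b101010, "CEM": 0}
word = pack(example)
```

A table-driven layout of this kind keeps the field boundaries in one place, so a change to the format affects only the `FIELDS` list.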


The microprogram control configuration consists of control memory CM, 
address register H, control-word register F, a three-phase clock, and eight decoders 
for decoding eight coded fields of the control word. This configuration is described 
below. 


Comment, microprogram control configuration                  (8.17)
Memory,       CM(H)=CM(0-255,0-31),   $control memory
Register,     H(0-7),                 $address register
              F(0-31),                $control word register
Subregister,  F(CNA)=F(18-23),
Clock,        P(0-2),
Decoder,      KA(0-3)=F(0-1),         $decode CA field
              KB(0-4)=F(2-4),         $decode CB field
              KOP(0-1)=F(5-6),        $decode COP field
              KSH(0-3)=F(7-8),        $decode CSH field
              KMB(0-5)=F(9-11),       $decode CMB field
              KAB(0-3)=F(12-13),      $decode CAB field
              KBB(0-3)=F(14-15),      $decode CBB field
              KST(0-3)=F(16-17),      $decode CST field


To illustrate the description and preparation of a microprogram, consider a
sequence which compares the numbers in registers A and B, assuming that these
numbers are in the signed 2's complement representation. If the number in register
A is larger than the number in register B, there is no change in instruction register
IC. If the two numbers are equal, register IC is incremented by one. If the number in
register A is smaller than that in register B, register IC is incremented by two. This
comparison is to be performed by subtracting the number in register B from the
number in register A, and then by examining the value on zero-tester terminal ZT and the
value on main bus terminal MB(0). If the value of ZT and MB(0) is 10, 00, or 11, then
the number in register A is, respectively, larger than, equal to, or smaller than the number in register
B (see Table 8.10).


TABLE 8.10 Interpretation of Zero Test

ZT    MB(0)    INTERPRETATION
0     0        Sum is 0
0     1        Not possible
1     0        Sum is positive
1     1        Sum is negative
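
The comparison and the interpretation in Table 8.10 can be traced with a short sketch. It assumes a 16-bit main bus, takes ZT=0 to mean a zero result as the table states, and ignores arithmetic overflow, which the text does not discuss. The function names are hypothetical.

```python
WIDTH = 16
MASK = (1 << WIDTH) - 1


def encode(value):
    """16-bit 2's complement pattern of a signed integer."""
    return value & MASK


def compare_step(a, b):
    """Form A + B' + 1 on the bus and return (ZT, MB(0))."""
    mb = (a + (~b & MASK) + 1) & MASK
    zt = 0 if mb == 0 else 1            # ZT=0 means the result is zero
    mb0 = (mb >> (WIDTH - 1)) & 1       # MB(0) is the sign bit
    return zt, mb0


def ic_increment(a, b):
    """Amount added to IC for each possible (ZT, MB(0)) outcome."""
    return {(1, 0): 0, (0, 0): 1, (1, 1): 2}[compare_step(a, b)]


larger = ic_increment(encode(5), encode(3))
equal = ic_increment(encode(3), encode(3))
smaller = ic_increment(encode(-2), encode(3))
```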


A sequence chart for this branch sequence is shown in Fig. 8.7. There are three 
blocks, one for each micro-instruction. In each block, the upper part specifies the 
connect and transfer micro-operations; the lower part determines the next control- 
word address. The number above each block is the assigned address of the control 
word specified by the block. The block with address “yyyyyyyy” performs the above- 
described comparison; the subtraction is carried out by adding 2’s complement of 
the number in register B to the number in register A. The branching is achieved by 
obtaining the next control-word-address bits H(6, 7) from bits STAT (0, 1) which 
store the value of ZT and MB(0). The block with address “xxxxxx00” increments 
the contents of register IC by one, while the block with address “xxxxxx11” increments 
the contents of register IC by two. No matter whether the result from the comparison
is positive, zero, or negative, the sequence eventually reaches the control word at
address “xxxxxx10.”

The above three micro-instructions are now described by the following state- 
ments. Note that switch START and control register G are provided to start the 
sequence with a proper address in register H. 


Comment, a microprogram for a branch sequence                (8.18)
Comment, start the sequence
/START(ON)*P(0)/  F←0, G←0,
/G'*P(1)/         H←“yyyyyyyy”, G←1,
/G*P(2)/          F←CM(H),


Entry: yyyyyyyy

yyyyyyyy:  L=A, R=B, C(31)=1,
           A←MB, MB=Q, Q=S,
           STAT(1)←MB(0),
           IF (ZT=0) THEN (STAT(0)←0)
           ELSE (STAT(0)←1),
           H(6, 7)←STAT(0, 1),
           next: xxxxxx11 if H(6, 7)=11; xxxxxx00 if H(6, 7)=00

xxxxxx11:  L=2, R=IC, C(31)=0,
           IC←MB, MB=Q, Q=S,
           H(6, 7)←10,
           next: xxxxxx10

xxxxxx00:  L=0, R=IC, C(31)=1,
           IC←MB, MB=Q, Q=S,
           H(6, 7)←10,
           next: xxxxxx10

Fig. 8.7 Sequence chart for a branch sequence


Comment, micro-instruction for subtracting B from A located at yyyyyyyy   (8.19)
/KA(3)*P(0)/   L=A,
/KB(1)*P(0)/   R=B,
/KOP(1)*P(0)/  C(31)=1,
/KSH(0)*P(0)/  MB=Q, Q=S,
/KMB(1)*P(0)/  A←MB,
/KST(3)*P(0)/  IF (ZT=0) THEN (STAT(0)←0) ELSE (STAT(0)←1),
               STAT(1)←MB(0),
/KAB(2)*P(1)/  H(6)←STAT(0),
/KBB(2)*P(1)/  H(7)←STAT(1),
/G*P(1)/       H(0-5)←F(CNA),        $F(CNA)=“xxxxxx”
/G*P(2)/       F←CM(H),


Comment, micro-instruction for incrementing IC by 1, located at xxxxxx00   (8.20)
/KA(0)*P(0)/   L=0,
/KB(3)*P(0)/   R=IC,
/KOP(1)*P(0)/  C(31)=1,
/KSH(0)*P(0)/  MB=Q, Q=S,
/KMB(3)*P(0)/  IC←MB,
/KST(0)*P(0)/
/KAB(1)*P(1)/  H(6)←1,
/KBB(0)*P(1)/  H(7)←0,
/G*P(1)/       H(0-5)←F(CNA),        $F(CNA)=“xxxxxx”
/G*P(2)/       F←CM(H)

Comment, micro-instruction for incrementing IC by 2, located at xxxxxx11   (8.21)
/KA(1)*P(0)/   L(0-7,8-15)=0-F(EM),  $F(EM)=2
/KB(3)*P(0)/   R=IC,
/KOP(0)*P(0)/  C(31)=0,
/KSH(0)*P(0)/  MB=Q, Q=S,
/KMB(3)*P(0)/  IC←MB,
/KST(0)*P(0)/
/KAB(1)*P(1)/  H(6)←1,
/KBB(0)*P(1)/  H(7)←0,
/G*P(1)/       H(0-5)←F(CNA),        $F(CNA)=“xxxxxx”
/G*P(2)/       F←CM(H)


The timing of the above description is as follows. The signal flows from a register
through the parallel adder and the shifter, and reaches the main bus during clock
phase P(0). The data transfer from the main bus to a register, as well as the address
transfer to register H, occurs during clock phase P(1). The control word is fetched
from memory CM and arrives at register F during clock phase P(2).


8.2.6 A Microprogrammed I/O Control Unit


An I/O control unit not only operates one or more I/O devices, but also
communicates with the data channel of a computer through an I/O interface as shown in
Fig. 8.8. Although the I/O interface for a family of computers may be standardized,
the I/O control units are different for different I/O devices. If the I/O control unit is
microprogram-controlled, then the microprogram I/O control unit or MICU can be
flexible enough to accommodate a wide variety of I/O devices. This section describes
an MICU, reported by McGee and Petersen [8].

[Figure: data channel, interface, I/O control unit, I/O device]

Fig. 8.8 I/O data transmission

The MICU consists of four 8-bit data registers (operand registers R1 and R2,
input register I, and output register O), three data buses (A, B, and C) and an
arithmetic and logical unit (ALU), as shown in Fig. 8.9. Buses A and B provide the input
data to the ALU, and bus C receives the output from the ALU. Any register connected
to buses A and B may provide data to the ALU, and any register connected to bus C
may receive data from the ALU. All these registers are connected to buses A, B, and
C except register I, which is not connected to bus C, and register O, which is not
connected to bus A. The ALU is capable of performing simple arithmetic and logic
operations on the input data such as add, subtract, logical-and, and logical-or.

[Figure: input device, output device]

Fig. 8.9 Partial configuration of the MICU


The MICU has another four 8-bit registers, input-control register IN,
output-control register OUT, status register STATUS, and device control register
CONTROL, as shown in Fig. 8.10. Register IN receives data from the channel and register
OUT transmits data to the channel. Register CONTROL provides control signals
to the input device or the output device. Each bit of register STATUS indicates
certain internal or external status. A bit of register STATUS may be set or reset by
an external source, such as the channel, to signify that the data has arrived at register
IN. Also, it may be set or reset to store an internal status such as a certain ALU
status or the status of a bit in the micro-instruction.

[Figure: external status, data-in and data-out lines to the data channel, ALU status, registers STATUS and CONTROL, input device, output device]

Fig. 8.10 Configuration of the MICU

The MICU has a microprogram control part which is similar to the MCU
shown in Fig. 8.2. The microprogram control part contains control memory CM,
address register H, and control-word register F. The control memory has a capacity
of 256 45-bit words. In addition, there are eight input data lines D-IN, eight output
data lines D-OUT, and eight external status lines E-STA. The configuration of the
MICU described above is specified by the following statements.


Comment, configuration of the MICU                           (8.22)
Register,  R1(0-7),                  $operand register
           R2(0-7),                  $operand register
           I(0-7),                   $input register
           O(0-7),                   $output register
           IN(0-7),                  $input-control register
           OUT(0-7),                 $output-control register
           STATUS(0-7),              $status register
           CONTROL(0-7),             $device control register
           H(0-7),                   $address register
           F(0-44),                  $control word register
Bus,       A, B, C,                  $bus A, B, C
Terminal,  D-IN(0-7),                $data-in lines
           D-OUT(0-7),               $data-out lines
           E-STA(0-7),               $external status lines
Memory,    CM(H)=CM(0-255,0-44),     $control memory


The control word has 45 bits which are divided into 10 fields as shown in Fig.
8.11. The number of bits and the designation of each field are shown in Table 8.11.


| CN | CL | CJ | COP | CA | CB | CC | CK | CS | CEX |
   7    3    2     4    3    3    3    8    4     8

Fig. 8.11 Control word fields of the MICU


The first three fields, CN, CL, and CJ, determine the next-control-word address. 
Field CN gives the seven high-order bits of the address, and field CL gives the condi- 
tion code which determines the low-order bit of the address as described previously. 
Field CJ specifies whether the next-control-word address is obtained from fields CN 
and CL or from some other branch source such as register R1. The use of these three
fields allows the specification of the next control-word address in every micro-instruc- 
tion together with a possible two-way branching. 

Fields CA, CB, CC, CK, and COP specify a micro-operation by the ALU. 


TABLE 8.11 Control Word Format of the MICU

FIELD   BITS     DESIGNATION
CN      0-6      High-order bits of the NCWA
CL      7-9      Branch condition code for the NCWA†
CJ      10-11    Branch source for the NCWA†
COP     12-15    Op-code for the ALU‡
CA      16-18    Code for registers connected to bus A
CB      19-21    Code for registers connected to bus B
CC      22-24    Code for registers connected to bus C
CK      25-32    Emit field
CS      33-36    Bit address of the status register
CEX     37-44    External code or constant

†NCWA means next-control-word address
‡ALU means the arithmetic and logical unit


Fields CA and CB specify two registers to provide input data to the ALU, and field 
CC specifies one register to receive the output from the ALU. Field CK is an emit 
field which may be used to provide a constant as one of the two inputs to the ALU. 
Field COP specifies the operation to be done by the ALU. Field CS causes a 1 or 0 
to be set into a certain STATUS-register bit according to a chosen condition; this 
condition may occur during the execution of the micro-instruction. Field CEX 
provides eight control bits which can be used as an external code or as an external 
constant. For example, if field CEX contains a code of 00001000, the micro-instruc- 
tion could cause the acceptance of an 8-bit input data into register IN. In short, a 
single micro-instruction selects the next control-word address, performs an arithmetic 
or a logical operation, sets a bit in the status register, and sends a code or a constant 
to external lines. 


8.3 Central Control Organization 


When a stored-program computer operates, it continuously executes the control
cycle. The simplest control cycle consists of two cycles, a fetch cycle (or instruction
cycle) and an execution cycle; the CPU alternately executes the fetch cycle and the
execution cycle. In order to respond to unpredictable events, the CPU allows the
control cycle to be interrupted after the current execution cycle is completed, as shown 
in Fig. 8.12. This section describes some aspects of central control organizations which 
implement the control cycle. 


8.3.1 Sequencing 


A stored-program computer is required to execute the sequence of instructions
in a program stored in the main memory. The sequencing of the program is
accomplished by controlling the next-instruction address. Furthermore, many recent
computers (particularly multiprogramming computers) are also required to be capable
of sequencing from one program to another (one program can be a supervisor
program). This is often accomplished by means of the interrupt. In short, the sequencing
is accomplished either by controlling the next-instruction address or by using the
interrupt facility of the computer.

[Figure: flowchart with blocks Interrupt?, Fetch cycle, Interrupt sequence, Execution cycle]

Fig. 8.12 A CPU control cycle

The sequencing by controlling the next-instruction address is accomplished by
(a) normal sequencing, (b) instruction branching, and (c) the use of the EXECUTE
instruction. The manner in which the sequencing is carried out depends on the
instruction format. The most commonly used instruction format has been the one which
has an op-code (operation code) and an operand address (or an instruction address
in the case of a branch instruction), shown in Fig. 8.13. Consider the simple control
configuration shown in Fig. 8.14, where the above-mentioned instruction format is
used. Normal sequencing fetches the next instruction from the address immediately
following that of the current instruction. This is exemplified below.

Fig. 8.13 Instruction format


Comment, a sequencing organization                           (8.23)
Register,  AR(0-14),                 $address register
           D(0-35),                  $buffer register
           PC(0-14),                 $program counter
           F(0-8),                   $op-code register
           IR(0-3),                  $interrupt-request register
           IRPT,                     $interrupt-sequence control register
Memory,    M(AR)=M(0-32767,0-35),

Fig. 8.14 A control organization

Comment, normal sequencing                                   (8.24)
/START/    IF (IR≠0) THEN (GOTO INTERPT);
           AR←PC;
           D←M(AR), PC←countup PC;
           F←D(0-8), AR←D(21-35);
           execution cycle;
           GOTO START;
/INTERPT/  IRPT←1;
           interrupt sequence
           GOTO START;


The statements above describe a fetch cycle. Register IR is first tested to see if there
is any interrupt request. If there is one or more, the interrupt sequence is initiated by
having register IRPT set to 1. If there is none, the fetch cycle begins by having the
next-instruction address in program counter PC transferred to address register AR.
The next instruction is then read out of the memory; at the same time, the program
counter is incremented by one to again give the next-instruction address. The op-code
part and the operand-address part of the instruction now in buffer register D are
transferred to registers F and AR, respectively. The fetch cycle is now completed.

If the sequencing is branched by a branch instruction, the next instruction address 
is taken from the address field of the branch instruction instead of from the program 
counter. This is exemplified below.


Comment, sequencing by a branch instruction                  (8.25)
           AR←PC;
           D←M(AR), PC←countup PC;
           F←D(0-8), AR←D(21-35);
           PC←D(21-35);


The fetch-cycle part of the above example remains the same (except the interrupt is 
omitted here). At the end of the fetch cycle, the next-instruction address is already 
in the address register; only the program counter needs to be updated. 

The EXECUTE instruction branches to the instruction located by the address 
in the address field of the EXECUTE instruction, executes it, and returns to the next 
instruction of the original program. It may be regarded as a one-instruction
subroutine. This is exemplified below.


Comment, sequencing by the EXECUTE instruction               (8.26)
/FETCH/    AR←PC;
           D←M(AR), PC←countup PC;   $EXECUTE instruction in D
           F←D(0-8), AR←D(21-35);
           D←M(AR);                  $branched instruction in D
           execution of the branched instruction;
           GOTO FETCH;


The above exemplifies sequencing by controlling the next instruction address. 
Sequencing by using the interrupt facility is to be described subsequently. 
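
The three ways of controlling the next-instruction address in statements (8.24) through (8.26) can be compared in a short sketch. It assumes the 36-bit instruction word used above, with the op-code in bits 0-8 and the address in bits 21-35, bit 0 being the high-order bit. The op-code values `ADD`, `SUB`, `BRANCH`, and `EXECUTE` and the function names are hypothetical, chosen only for the example.

```python
ADDR_MASK = (1 << 15) - 1

# Hypothetical op-code assignments, used only in this sketch.
ADD, SUB, BRANCH, EXECUTE = 1, 2, 3, 4


def make(op, addr):
    """Assemble a 36-bit word: op-code in D(0-8), address in D(21-35)."""
    return (op << 27) | (addr & ADDR_MASK)


def opcode(word):
    return (word >> 27) & 0x1FF


def address(word):
    return word & ADDR_MASK


def step(memory, pc):
    """Fetch one instruction; return (op-code to execute, its address, next PC)."""
    ar = pc                               # AR <- PC
    d = memory[ar]                        # D <- M(AR)
    pc = (pc + 1) & ADDR_MASK             # PC <- countup PC
    f, ar = opcode(d), address(d)         # F <- D(0-8), AR <- D(21-35)
    if f == BRANCH:
        return f, ar, ar                  # PC <- D(21-35)
    if f == EXECUTE:
        d = memory[ar]                    # branched instruction in D
        return opcode(d), address(d), pc  # PC still points past EXECUTE
    return f, ar, pc


memory = {
    0: make(ADD, 100),
    1: make(EXECUTE, 10),
    2: make(BRANCH, 0),
    10: make(SUB, 200),
}
normal = step(memory, 0)
executed = step(memory, 1)
branched = step(memory, 2)
```

In the EXECUTE case the program counter is not disturbed by the branched instruction, which is why the sequence returns to the original program.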


8.3.2 Addressing 


Memory addressing has been described and exemplified in Chapter 7. There
have been detailed descriptions of immediate addressing, direct addressing, indirect
addressing, indexed addressing, relative addressing, base addressing, and register
addressing. Addressing is an important characteristic of the computer organization. Fig. 8.15
shows the flowchart of the control cycle with indexing and indirect addressing in the
fetch cycle. A control organization which implements the fetch cycle with indexing
and indirect addressing is exemplified below.

[Figure: flowchart with blocks Interrupt?, Fetch instruction, Interrupt sequence, Index operation, Fetch operand, Indirect addressing?, Execution cycle]

Fig. 8.15 A CPU control cycle with indexing and indirect addressing


Comment, a control organization with indexing and indirect addressing   (8.27)
Register,        AR(0-14),               $address register
                 D(0-35),                $buffer register
                 PC(0-14),               $program counter
                 I,                      $indirect addressing flag
                 AC(0-35),               $accumulator
Array-register,  XR(1-3,0-14),           $index registers
Subregister,     D(ADDR)=D(21-35),       $instr. address field
Memory,          M(AR)=M(0-32767,0-35),
Comment, fetch the instruction                               (8.28)
     AR←PC;          $transfer next instruction address to AR
     D←M(AR);        $read instruction out of memory
Comment, indexing                                            (8.29)
     IF (D(18-20)=0) THEN (AR←D(ADDR)),
     IF (D(18-20)=1) THEN (AR←D(ADDR) add XR(1,)),
     IF (D(18-20)=2) THEN (AR←D(ADDR) add XR(2,)),
     IF (D(18-20)=3) THEN (AR←D(ADDR) add XR(3,)),
     OP←D(0-11),     $store op-code
     I←D(12);        $store indirect addressing bit
     D←M(AR);        $read operand out of the memory
Comment, indirect addressing                                 (8.30)
/X/  IF (I=1) THEN (AR←D(ADDR)) ELSE (GOTO Y);
     IF (I=1) THEN (D←M(AR));
     IF (I=1) THEN (I←D(12), GOTO X);
/Y/  AC←D;
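
The operand fetch of statements (8.29) and (8.30) can be followed in a short sketch. It assumes the field positions used in those statements (indirect bit in D(12), index code in D(18-20), address in D(21-35), bit 0 being the high-order bit of a 36-bit word). Index codes 4 through 7 are not defined in the text, so the sketch rejects them; that choice, the helper names, and the example memory contents are assumptions.

```python
ADDR_MASK = (1 << 15) - 1


def field(word, first, last):
    """Extract bits first..last of a 36-bit word (bit 0 is the high-order bit)."""
    width = last - first + 1
    return (word >> (35 - last)) & ((1 << width) - 1)


def make_word(op=0, indirect=0, index=0, addr=0):
    """Assemble a word from op-code, indirect bit, index code, and address."""
    return (op << 24) | (indirect << 23) | (index << 15) | (addr & ADDR_MASK)


def fetch_operand(memory, xr, instruction):
    """Return the operand after indexing and repeated indirect addressing."""
    index = field(instruction, 18, 20)
    ar = field(instruction, 21, 35)
    if index in (1, 2, 3):
        ar = (ar + xr[index]) & ADDR_MASK      # AR <- D(ADDR) add XR(index,)
    elif index != 0:
        raise ValueError("index code not defined in the text")
    indirect = field(instruction, 12, 12)       # I <- D(12)
    d = memory[ar]                               # D <- M(AR)
    while indirect == 1:
        ar = field(d, 21, 35)                    # AR <- D(ADDR)
        d = memory[ar]                           # D <- M(AR)
        indirect = field(d, 12, 12)              # I <- D(12)
    return d                                     # AC <- D


memory = {
    50: 1234,                                    # the operand itself
    60: make_word(indirect=1, addr=50),          # a word that points onward
    70: make_word(addr=60),
}
xr = {1: 10, 2: 0, 3: 0}

direct = fetch_operand(memory, xr, make_word(addr=50))
indexed = fetch_operand(memory, xr, make_word(index=1, addr=40))
two_level = fetch_operand(memory, xr, make_word(indirect=1, addr=70))
```

As in statement (8.30), the indirect flag is reloaded from each newly fetched word, so the chain continues until a word with a 0 in bit 12 is read.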


The addressing technique described in the dynamic loader of Chapter 7 may be
called dynamic addressing, because the actual memory address of an instruction
or an operand is not known until the time of execution. During the execution, the
memory address is interpreted from the table stored in the associative memory and 
then transferred to the memory address register for accessing the instruction or the 
data word. 

If dynamic addressing is used in a control cycle, the flowchart of the control 
cycle may appear as that shown in Fig. 8.16, where there are two new operations, 
the operand address interpretation and the instruction address interpretation. The 
operand address interpretation is the procedure whereby the actual operand address 
is obtained by using page address Z in the associative memory as described in section 
7.5.4 on operand address fetch. The instruction address interpretation is the procedure 
whereby the actual instruction address is obtained by using page address X in the 
associative memory as described in section 7.5.3 on instruction sequencing. 


[Figure: flowchart with blocks Interrupt?, Fetch instruction, Instruction has an operand address?, Indexing?, Index operation, Operand address interpretation, Indirect addressing?, Interrupt sequence, Execution cycle, Instruction address interpretation]

Fig. 8.16 A CPU control cycle with dynamic addressing


8.3.3 Priority Interrupt 


When a digital computer is operating, certain rare events or exceptional
conditions such as an addition overflow, a parity error, a request from an I/O operation,
or a signal from a real-time input may occur. These events and conditions are
infrequent and unpredictable. However, whenever they happen, they must be attended
to by the CPU as quickly as possible. In the early days, a program loop was used for
testing and waiting for the event to occur; this resulted in a great inefficiency of
hardware utilization. Today, the interrupt (also known as trap) is commonly used in
most recent computers. When an interrupt occurs, the event is handled partly by
hardware and partly by software. It is important that the time taken to handle the
interrupt be very short so that the next interrupt can be handled with little delay.

The first use of interrupt on a large-scale digital computer was probably on the 
Univac Scientific 1103A at N.A.C.A., Cleveland, as reported by Mersel [1] in 1956. 
As described, the interrupt operates as follows. When an interrupt signal is sent to 
the Univac Scientific, the computer, immediately after finishing its present instruc- 
tion, notes the location of which instruction it would normally do next, and jumps 
to a fixed address which will allow the value of this location to be stored and from 
which the computer will find the information as to what it is to do because of its 
having been interrupted. 

Interrupt organizations of recent digital computers are becoming more sophisti- 
cated and more powerful than those of the early ones. Instead of one or a few, many 
are now allowed to be interrupt sources such as arithmetic overflows, illegal op-codes, 
inquiries from the terminals, machine malfunctioning, attention request from the 
timer, violation of memory protection, and completion of an I/O operation. Similar 
interrupt sources are usually grouped into the same classes; each class is given a pri- 
ority level according to importance and urgency. Simultaneous interrupts from 
different classes are accepted by the CPU according to their priorities; that is, the one 
with the highest priority is accepted first. It is desirable that these priority levels can 
be modified or changed by programming. A common practice is to assign a mask 
bit to each interrupt. When a particular interrupt is masked, it is ignored. Although
an interrupt from a higher priority level may interrupt an interrupt being handled by
the CPU, there are short time periods during which no interrupt should be allowed to
disturb the CPU operation; this is accomplished by a single-bit enable-disable register.

In order to better understand the operation of priority interrupt, a
priority-interrupt organization is now presented. The configuration of the priority interrupt
organization is shown in Fig. 8.17. As shown, there are six registers: input signal
register IS which accepts the signals from the three interrupt lines IN at three priority
levels; status register STATUS which accepts the signals from the three sets of
interrupt status lines SS1, SS2, and SS3; interrupt mask register IM which masks the
respective bits in register IS; priority interrupt register IP which stores the incoming
interrupt with the highest priority; incoming interrupt-level register IC which stores
the priority level of the incoming interrupt with the highest priority; and current
interrupt-level register J which stores the priority level of the current interrupt now
being handled by the CPU. There are four single-bit registers: interrupt-request
register Q which when 1 indicates the occurrence of one or more interrupt signals;
interrupt-enable register EN which when 1 or 0 enables or disables the interrupt
test subsequence, respectively; and registers E and I which when 1 indicate execution
cycle and instruction cycle, respectively. In addition, there is a priority logic network
which selects from the incoming interrupts the one with the highest priority, an
encoder which encodes the priority-selected interrupt into a binary number, a
subtracter for testing if the number in register IC is greater than that in register J, and a
three-phase clock.

[Figure: interrupt status lines, interrupt signal lines IN, registers IS and STATUS, terminals IR, priority logic network, subtracter for testing greater-than with output B(S)]

Fig. 8.17 A priority interrupt configuration

The above priority interrupt configuration is now described by the following
CDL statements.


Comment, priority interrupt organization                     (8.31)
Register,  IS(1-3),       $interrupt signal register
           IM(1-3),       $interrupt mask register (enabled when 1)
           IP(1-3),       $priority interrupt register
           IC(0-1),       $incoming interrupt level
           J(0-1),        $current interrupt level
           EN,            $interrupt enable register (when 1)
           Q,             $interrupt request register (when 1)
           E,             $execution cycle register (when 1)
           I,             $instruction cycle register (when 1)
           STATUS(1-3),   $interrupt-source status register
Clock,     P(0-2),
Encoder,   V(0-1)=IP,     $encode interrupt with highest priority
Comment, input lines                                         (8.32)
Terminal,  IN(1-3),       $interrupt input lines
           SS1(1-3),      $interrupt status lines
           SS2(1-3),      $interrupt status lines
           SS3(1-3),      $interrupt status lines
Comment, a logic network for identifying unmasked interrupts (8.33)
Terminal,  IR(1-3)=IS(1-3)*IM(1-3),
Comment, a subtracter for testing incoming priority level to be greater   (8.34)
Terminal,  B(S)=J(0)'*IC(0)+IC(0)*B(0)+B(0)*J(0)',
           B(0)=J(1)'*IC(1),                                 (8.35)
Block,     STATUS-IN(STATUS(1-3)←IP(1)*SS1+IP(2)*SS2+IP(3)*SS3),
Comment, accept interrupt signal                             (8.36)
/P(2)/       IS←IN,                              $enter interrupt signals
Comment, interrupt test subsequence                          (8.37)
/EN*E*P(0)/  Q←IR(1)+IR(2)+IR(3),                $enter interrupt request
             IF (IR(3)=1) THEN (IP←4) ELSE       $priority determination
             (IF (IR(2)=1) THEN (IP←2) ELSE
             (IF (IR(1)=1) THEN (IP←1) ELSE (IP←0))),
/Q*E*P(1)/   IS←0, IC←V, DO STATUS-IN,
/Q*E*P(2)/   Q←0, IF (B(S)=1) THEN (L←1, EN←0),


The terminals IR when 1 indicate the presence of unmasked interrupts. Terminal
B(S) when 1 indicates that the encoded priority-level in register IC is greater than 
the priority-level in register J. Micro-operation STATUS-IN in the block statement 
selects one of the three sets of interrupt status lines according to the value in register 
IP (i.e., the interrupt with the highest priority-level). 

The statement (8.36) describes the incoming interrupt signals, which, if they 
occur, are accepted at every clock phase P(2). Statement (8.37) shows that the interrupt 
test subsequence normally occurs at every execution cycle, unless it is disabled by 
resetting register EN to 0. During the first clock phase, the occurrence of the interrupt 
signal or signals is indicated in register Q and, at the same time, the interrupt signal 
with the highest priority is selected by the priority logic network and enters into 
register IP. During the second phase, register IS is reset to 0 so that it can be ready
to accept the incoming interrupt signals at the next clock phase; the encoded priority
level of the selected interrupt enters into register IC; and micro-operation
STATUS-IN is performed to select and accept status of the selected interrupt. During the third
phase, register Q is reset to 0; the interrupt sequence is initiated (by setting register 
L to 1) and the interrupt test subsequence is disabled if register IP does not contain 
0 and if terminal B(S) indicates a 1. 

In short, at the end of every execution cycle, the interrupt sequence is initiated 
if the interrupt has not been disabled and if the selected interrupt signal has a higher 
priority level than the current one. 
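
The test subsequence can be illustrated with a minimal Python sketch. It is not part of the CDL description: the function names are hypothetical, terminal B(S) is assumed to be the ordinary borrow-out of the two-bit subtraction J minus IC, and the one-hot register IP and the encoder output V are collapsed into a single returned level.

```python
def select_level(ir):
    """Return the highest requesting level (3, 2 or 1), or 0 if none.

    ir maps level -> 0/1 and stands in for terminals IR(1-3).
    """
    for level in (3, 2, 1):
        if ir[level] == 1:
            return level
    return 0


def borrow_out(j, ic):
    """Two-bit borrow chain for J - IC; 1 means IC > J (assumed B(S))."""
    j0, j1 = (j >> 1) & 1, j & 1          # bit 0 is the most significant bit
    ic0, ic1 = (ic >> 1) & 1, ic & 1
    b0 = (1 - j1) & ic1                   # B(0) = J(1)' * IC(1)
    return ((1 - j0) & ic0) | (ic0 & b0) | (b0 & (1 - j0))


def interrupt_test(en, ir, j):
    """Return the level to switch to, or None if no interrupt is taken."""
    if not en:                            # test subsequence disabled (EN = 0)
        return None
    ic = select_level(ir)
    return ic if borrow_out(j, ic) == 1 else None
```

With the current level J at 2, a pending level-3 request is accepted while a pending level-1 request is not, which is the behavior the summary paragraph describes.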


8.3.4 Interrupt Sequence 


The above priority-interrupt organization specifies that a priority level is given 
to each of the three interrupt lines. Let the supervisor program, the I/O requests, 
and the machine errors be the three classes of interrupt sources at priority levels 
1, 2, and 3, respectively. Let the user program be indicated by priority level 0. When 
an incoming interrupt is accepted, the memory address in the program counter for 
the current program is stored in one of the first four locations of the main memory, 
as shown in Table 8.12. The contents of the CPU registers should also be stored. 
These registers can be the accumulator, the MQ register, the index registers, the 
general-purpose registers, and the others. For simplicity, assume that only the accu- 
mulator needs to be stored. It is stored in one of the second four locations of the main 
memory, as shown in Table 8.12. The interrupt status of the three priority levels is 
stored in the main memory locations 9, 10, and 11. The first instructions of the three
interrupt routines are stored in the main memory locations 13, 14, and 15, as also 
shown in Table 8.12. These routines are the software part that handles the interrupts. 

Figure 8.18 is the flowchart of the control cycle where the interrupt sequence is 
shown in more detail than that in Figs. 8.13 and 8.15. The interrupt sequence consists 
of the test subsequence, the anterior subsequence, and the posterior subsequence. 




Fig. 8.18 Flowchart showing the control cycle


The test subsequence has been described by statements (8.37); this subsequence tests 
the incoming interrupt signals during every execution cycle, but it is disabled during 
the fetch cycle and the interrupt anterior and posterior subsequences. The interrupt 
anterior subsequence is initiated after it is determined that the incoming interrupt 
has a higher priority level than the current one being handled by the CPU. This 




subsequence disables the interrupt test subsequence. It stores the program counter, 
the accumulator, and the interrupt status. It then transfers to the interrupt routine 
with the highest priority. This routine is located by the address in an assigned area 
of the main memory (Table 8.12); at the same time, the interrupt test subsequence 
is enabled. Another interrupt may be accepted again as early as when the transfer 
to the interrupt routine is completed. 


TABLE 8.12 Main Memory Map

ADDRESS   MAIN MEMORY MAP
 0        Program counter for the user program interrupt
 1        Program counter for the supervisor interrupt
 2        Program counter for the I/O interrupt
 3        Program counter for the machine-error interrupt
 4        Accumulator for the user program interrupt
 5        Accumulator for the supervisor interrupt
 6        Accumulator for the I/O interrupt
 7        Accumulator for the machine-error interrupt
 8
 9        Status for the supervisor interrupt
10        Status for the I/O interrupt
11        Status for the machine-error interrupt
12
13        Address for the supervisor interrupt routine
14        Address for the I/O interrupt routine
15        Address for the machine-error interrupt routine


At this time, one of the interrupt routines takes over. With the interrupt status 
information, it carries out the functions that are programmed to handle the particular 
interrupt source. The last instruction of the interrupt routine should be a return 
instruction. The return instruction disables the interrupt test subsequence and initiates 
the interrupt posterior subsequence. The posterior subsequence tests whether the 
interrupt of the next lower priority level is active. If it is active, this interrupt with a 
lower priority level is handled. If it is not active, it continues to test until an active 
lower level or the lowest one (i.e., the user program) is reached. In either case, the
program counter and the accumulator are restored. 

The organization which implements the interrupt sequence shown in Fig. 8.18 
is now described. It makes use of the priority interrupt organization described in 
statements (8.31), and the control configuration shown in Fig. 8.14 in addition to
local-control register L and its associated decoder. This organization is described 
by the following statements. 


Comment, interrupt sequence (8.38) 


Comment, central control organization 




Register, AR(0-14), $address register
          D(0-35), $buffer register
          PC(0-14), $program counter
          F(0-8), $op-code register
          L(0-2), $local control register
          AC(0-35), $accumulator
          I, $fetch cycle when 1
          E, $execution cycle when 1
Memory, M(AR)=M(0-32767,0-35),
Decoder, K(1-7)=L,


Comment, here begins the anterior subsequence. 

Comment, continuation from statements (8.37) 

Comment, store program counter (8.39) 
/K(1)*P(0)/ AR(0-12)←0, AR(13-14)←J,
/K(1)*P(1)/ D(0)←1, D(1-20)←0, D(21-35)←PC,
/K(1)*P(2)/ M(AR)←D, L←countup L,

Comment, store accumulator (8.40)
/K(2)*P(0)/ AR(0-12)←1, AR(13-14)←J,
/K(2)*P(1)/ D←AC,
/K(2)*P(2)/ M(AR)←D, L←countup L,

Comment, store interrupt status (8.41)
/K(3)*P(0)/ AR(0-12)←2, AR(13-14)←IC,
/K(3)*P(1)/ D(0-27)←0, D(28-35)←STATUS,
/K(3)*P(2)/ M(AR)←D, L←countup L,

Comment, transfer to the higher priority interrupt routine (8.42)
/K(4)*P(0)/ J←IC, AR(0-12)←3, AR(13-14)←IC,
/K(4)*P(1)/ D←M(AR),
/K(4)*P(2)/ F←"TRA", AR←D(ADDR), E←1, EN←1, L←0,


Comment, execution cycle (transfer instruction) (8.43) 
/E*TRA*P(0)/ AR←PC,
/E*TRA*P(1)/
/E*TRA*P(2)/ IF (Q=0) THEN (I←1), E←0,




Comment, here is the end of the anterior subsequence 


Comment, fetch cycle (8.44) 
/I*P(0)/ AR←PC,
/I*P(1)/ D←M(AR), PC←countup PC,
/I*P(2)/ F←D(0-8), I←0, E←1,

Comment, the CPU executes an interrupt routine, allowing interrupt with a higher priority (8.45)


Comment, at the end, the CPU executes a return instruction (8.46) 
Comment, here begins the posterior subsequence 
/E*RTN*P(2)/ IF (J≠0) THEN (J←countdn J), EN←0, L←5, E←0,

Comment, test for returning to a lower-priority interrupt routine (8.47)
/K(5)*P(0)/ AR(0-12)←0, AR(13-14)←J,
/K(5)*P(1)/ D←M(AR),
/K(5)*P(2)/ IF ((D(0)=1)+(J=0)) THEN (D(0)←0, L←6)
            ELSE (J←countdn J),

Comment, return to a lower priority interrupt routine (8.48)
Comment, restore program counter
/K(6)*P(0)/ AR(0-12)←0, AR(13-14)←J,
/K(6)*P(1)/ D←M(AR),
/K(6)*P(2)/ PC←D(21-35), L←countup L, EN←1, E←1,
Comment, restore accumulator
/K(7)*P(0)/ AR(0-12)←1, AR(13-14)←J,
/K(7)*P(1)/ D←M(AR),
/K(7)*P(2)/ AC←D, IF (Q=0) THEN (I←1, E←0),
            IF (B(S)=0) THEN (L←0),


Comment, here ends the posterior subsequence 


In the above description, the anterior subsequence is described in statements 
(8.39)-(8.43). It takes five clock periods. The micro-operation of setting bit D(0) to 
1 as described in statements (8.39) is to note the particular interrupt level being 
active; this bit is then examined during the test in the posterior subsequence as 
described in statements (8.47). The locations where the program counter and the 
accumulator are stored are obtained from register J, as shown in statements (8.39) 




and (8.40). The locations where the interrupt status and the first instruction of the 
interrupt routine with higher priority are stored are obtained from register IC, as 
described in statements (8.41) and (8.42). Terminal TRA is the command for the 
transfer instruction. 
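
The address formation used by the anterior subsequence can be sketched in Python as follows. This is an illustration only: the function name and the group constants are hypothetical, and the sketch assumes that the 13 high-order address bits AR(0-12) hold the group number while the two low-order bits AR(13-14) hold the priority level taken from register J or IC.

```python
# Group numbers loaded into AR(0-12) by statements (8.39)-(8.42).
PROGRAM_COUNTER, ACCUMULATOR, STATUS, ROUTINE = 0, 1, 2, 3


def save_area_address(group, level):
    """Form a 15-bit address: group in AR(0-12), level in AR(13-14)."""
    if not 0 <= level <= 3:
        raise ValueError("priority level must fit in two bits")
    return (group << 2) | level
```

For example, the accumulator of a program running at level 3 is saved at location 7, the status of a level-1 interrupt at location 9, and the entry for the level-3 routine is found at location 15, in agreement with Table 8.12.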

The anterior subsequence is followed by the execution of an interrupt routine. 
During this execution, interrupt by another interrupt with a higher priority level is 
allowed, because the interrupt test subsequence is enabled at the end of every fetch 
cycle. In this way, the following execution cycle and the interrupt test occur simul- 
taneously during the next clock period, and the test for an interrupt at a higher- 
priority level is then made at the end of the execution cycle. The last instruction of 
the interrupt routine must be a return instruction; during its execution, the interrupt 
test subsequence is disabled, register J either remains 0 if it is 0 or is decremented by 
1 if it is not 0, and the posterior subsequence is initiated. 

The posterior subsequence is described in statements (8.47) and (8.48). Its func- 
tions are to look for an active interrupt with a lower priority level from the first four 
locations in the main memory, and then to transfer to the interrupt routine of that 
level. Notice that statements (8.47) form a loop as long as bit D(0) is not 1 and register 
J is not 0 and that terminal RTN is the command for the return instruction. 
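
The search loop of the posterior subsequence can be sketched in Python as follows. The sketch is a simplification under stated assumptions: the function name and data layout are hypothetical, each saved program-counter word is modeled as an (active, pc) pair standing for bit D(0) and bits D(21-35), and the clearing of D(0) in the buffer register is omitted.

```python
def posterior(j, saved_pc, saved_ac):
    """Return (level, pc, ac) of the program to resume after a return.

    saved_pc: list of four (active, pc) pairs for locations 0-3.
    saved_ac: list of four accumulator values for locations 4-7.
    """
    if j != 0:
        j -= 1                      # return instruction lowers the level
    while True:
        active, pc = saved_pc[j]
        if active or j == 0:        # active lower level, or the user program
            return j, pc, saved_ac[j]
        j -= 1                      # keep looking one level lower
```

If the user program was interrupted by level 2 and level 2 in turn by level 3, a return at level 3 resumes level 2, and the later return at level 2 skips the inactive level 1 and resumes the user program.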


8.4 Asynchronous Control Organization 


A modern digital computer is interconnected from a number of system units. 
Interfaces are provided for the control and data transmission among the units. Each 
system unit has its own control and its own operations independent of other units. 
The operation of such an organization is called asynchronous. 

To illustrate this asynchronous operation among system units, a simple stored- 
program computer is now described. This computer consists of three system units: 
central processing unit (or CPU), memory unit (or MU), and input output unit (or 
IOU). The memory word-length is 16 bits. As a number, it is in the signed 2’s com- 
plement representation. As an instruction, it has two fields: a 4-bit op-code field and 
a 12-bit address field. The instruction format is shown in Fig. 8.12. There are eight 


TABLE 8.13 Instruction Set

INSTRUCTION      OP-CODE†   SYMBOLIC CODE
Add              06         ADD
Subtract         05         SUB
Load             02         LDA
Store            01         STO
Jump             12         JMP
Jump-on-minus    11         JOM
Input            15         IN
Output           16         OUT

†In octal




instructions: ADD, SUB, LOAD, STORE, JUMP, JUMP-ON-MINUS, INPUT, 
and OUTPUT. Table 8.13 shows the op-codes of these instructions. 
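
As an illustration of the word format, the following Python sketch splits a 16-bit word into its two fields and also interprets a word as a signed number. The function names are hypothetical, and the sketch assumes that the op-code occupies the four leftmost bits of the word.

```python
# Op-codes from Table 8.13 (octal).
OPCODES = {
    0o06: "ADD", 0o05: "SUB", 0o02: "LDA", 0o01: "STO",
    0o12: "JMP", 0o11: "JOM", 0o15: "IN", 0o16: "OUT",
}


def decode(word):
    """Split a 16-bit instruction into (symbolic code, 12-bit address)."""
    op = (word >> 12) & 0xF          # 4-bit op-code field
    address = word & 0xFFF           # 12-bit address field
    return OPCODES.get(op, "???"), address


def to_signed(word):
    """Interpret a 16-bit word as a signed 2's complement number."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word
```

A word whose op-code field holds octal 06 and whose address field holds octal 1234 decodes to ADD with address 1234 (octal); the word with all 16 bits set represents the number -1.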


8.4.1 Memory Unit 


The memory unit contains main memory M with 4096 16-bit words, buffer
register RZ, address register HZ, and memory control register E. In addition, there
are three single-bit registers N, XA, and XB. Register N indicates a read operation
when it is 0; otherwise, it indicates a write operation. Register XA, when 1, indicates
the request to main memory M by the CPU. Register XB, when 1, indicates the request
to main memory M by the IOU. For simplicity, the cycle time of the main memory
is assumed to be equal to the clock cycle time. (The case in which they are not equal is
left as a problem.) These registers and other computer elements of the memory unit
are described below.


Comment, configuration of the MU (8.49)
Register, RZ(0-15), $buffer register
          HZ(0-11), $address register
          E(0-3), $memory control register
          N, $read operation when 0; else, write operation
          XA, $memory request by CPU when 1
          XB, $memory request by IOU when 1
Decoder, KE(0-15)=E,
Switch, START(ON),
Memory, M(HZ)=M(0-4095,0-15),
Clock, P,


The registers in the memory unit together with those in the two other units are shown 
in Fig. 8.19. 

The sequential operation of the MU is shown in Fig. 8.20. There are two se- 
quences which are controlled by registers XA and XB. When the MU is started, 
these two registers are reset to 0. Register XB is first examined; if it is 1, it indicates 
the IOU requested sequence. If it is 0, register XA is then examined. If register XA 
contains a 1, it indicates the CPU requested sequence. If register XA contains 0, it 
returns to examine register XB again and a wait loop appears. The order of examining 
register XB first and register XA second establishes the higher priority for the IOU 
requested sequence than for the CPU requested sequence. 

The IOU requested sequence is shown on the left side of Fig. 8.20. It first acknowledges
the request by resetting register XB to 0. It transfers the contents of register
WB in the IOU to register N and the contents of register HB in the IOU to address
register HZ. If register N contains a 1, a memory write operation is required. In this




Fig. 8.19 Registers and memories of the memory unit, central
processing unit and input unit


case, the contents of subregister B(BUF) in the IOU are transferred to the buffer
register RZ, and the contents of buffer register RZ are then stored into memory M.
If register N contains a 0, a memory read operation is required. In this case, a word
is read out of memory M into buffer register RZ, and the contents of register RZ are
then transferred to subregister B(BUF) in the IOU. Whether it is a read
or a write operation, register YB in the IOU is now set to 1 to inform the IOU that
the requested memory operation is completed. The MU then returns to the wait loop.
The CPU requested sequence is shown on the right side of Fig. 8.20. It acknowl- 
edges the request by resetting register XA to 0 and transfers the contents of registers 




Fig. 8.20 Sequence chart for the memory unit


WA and HA in the CPU to registers N and HZ, respectively. Again, the operation
is read if register N is 0; otherwise, it is write. In either case, the read or write operation
is similar to that which occurs during the IOU requested sequence, except that
the data are exchanged with register R in the CPU. Register
YA in the CPU is now set to 1 to inform the CPU that the requested memory operation
is completed. Then, the MU again returns to the wait loop.

The above MU operation, as shown in the sequence chart in Fig. 8.20, is now 
described by the CDL statements below. 




Comment, the MU Operation. IOU has a higher priority than CPU. (8.50)
/START(ON)/ XA←0, XB←0, E←0,
/KE(0)*P/   IF (XB=1) THEN (E←1) ELSE (E←8),

Comment, IOU requested sequence
/KE(1)*P/   XB←0, N←WB, E←3,
/KE(3)*P/   HZ←HB,
            IF (N=0) THEN (E←7) ELSE (E←2),
/KE(2)*P/   RZ←B(BUF), E←6,
/KE(6)*P/   M(HZ)←RZ, E←4,
/KE(7)*P/   RZ←M(HZ), E←5,
/KE(5)*P/   B(BUF)←RZ, E←4,
/KE(4)*P/   YB←1, E←0,

Comment, CPU requested sequence
/KE(8)*P/   IF (XA=0) THEN (E←0) ELSE (E←9),
/KE(9)*P/   XA←0, N←WA, E←11,
/KE(11)*P/  HZ←HA,
            IF (N=0) THEN (E←15) ELSE (E←10),
/KE(10)*P/  RZ←R, E←14,
/KE(14)*P/  M(HZ)←RZ, E←12,
/KE(15)*P/  RZ←M(HZ), E←13,
/KE(13)*P/  R←RZ, E←12,
/KE(12)*P/  YA←1, E←0,
END
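
The arbitration between the two requesters can be illustrated by a short Python sketch. It is a behavioral summary, not a clock-by-clock model of the control register E: the function name and the request tuples are hypothetical, and each call stands for one pass from the wait loop through a complete read or write.

```python
def mu_service(mem, cpu=None, iou=None):
    """Serve at most one pending memory request, IOU before CPU.

    A request is a tuple (write, address, data) with write = 1 for a
    write and 0 for a read.  Returns (requester, word_read_or_None).
    """
    for who, request in (("IOU", iou), ("CPU", cpu)):   # XB is examined first
        if request is None:
            continue
        write, address, data = request
        if write:
            mem[address] = data & 0xFFFF    # RZ <- data, then M(HZ) <- RZ
            return who, None
        return who, mem[address]            # RZ <- M(HZ), then back to requester
    return None, None                       # no request: stay in the wait loop
```

When both units request at the same time, only the IOU request is served on that pass; the CPU request is left pending and is served on a later pass.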


8.4.2 Input Output Unit 


Main memory M in the MU is divided into 64 blocks of 64 16-bit words. Each
block is called a page; thus, memory M has 64 pages. The IOU has a large-storage
memory L which contains 1024 1024-bit words. The size of each 1024-bit word
matches the size of a page. The transfer between main memory M and large-storage
memory L is made one page at a time. A page transfer requires two addresses: a 10-bit
address for memory L and a 6-bit page address for memory M. These two
addresses constitute a 16-bit word stored in main memory M and pointed to by the
address in the address field of the input or output instruction. In short, an input or
an output transfer requires an instruction and an associated address word.
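
A small Python sketch shows how such an address word could be taken apart. The function names are hypothetical, and the sketch assumes that the six leftmost bits of the word are the page address and the remaining ten bits are the address for memory L.

```python
def split_address_word(word):
    """Return (page address for M, word address for L) from a 16-bit word."""
    page = (word >> 10) & 0x3F      # 6-bit page address
    l_address = word & 0x3FF        # 10-bit large-storage address
    return page, l_address


def main_memory_address(page, offset):
    """12-bit main-memory address of word `offset` (0-63) within a page."""
    return (page << 6) | offset
```

For instance, page 3 begins at main-memory location 192, so the word at offset 10 of that page is at location 202.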

The IOU has buffer register B, address register HB, page-address register J,
address register G, control register S, and counter C. In addition, there are four single-bit
registers, WB, YB, Q, and Z. When register WB is 0, it indicates a read operation
in main memory; otherwise, it indicates a write operation. When register YB contains
1, it indicates that the requested memory operation is completed. Register Q, when 1,
indicates that the IOU is busy; otherwise, it is free. Register Z, when 1, indicates
that an output sequence is requested by the CPU; otherwise, an input sequence is
requested. For simplicity, the cycle time of the large-storage memory is assumed to
be equal to the clock cycle time. (The case when they are not equal is left as a problem.)

These registers and other computer elements of the IOU are now described 
below. 


Comment, configuration of the IOU (8.51) 
Register, HB(0-11), $address register

WB, $read operation when 0, else write operation 

YB, $memory operation complete when 1 

S(0-3), $IOU control register 

Q, $IOU busy when 1; else, free 

Z, $output sequence when 1; else, input sequence 


B(0-1023), $buffer register 
J(0-5), $page address register 
C(0-5), $counter 
G(0-9), $address register 
Subregister, B(BUF)=B(0-15), 
Decoder, KS(0-15)=S, 
Memory, L(G)=L(0-1023,0-1023), 
Clock, P, 


The sequential operation of the IOU is shown in Fig. 8.21. The IOU is free or 
busy when register Q is 0 or 1, respectively. When the IOU is started, registers Q and 
YB are both reset to 0. Register Q is next examined to see if it is 1. If it is not, it is 
continuously examined until it is 1. This repeated examination is indicated in Fig. 
8.21 by a wait loop. Whenever it becomes 1, counter C is set to 0. An input or an
output sequence now begins.

When register Z contains a 0, it is the input sequence. It begins by reading a
word out of large-storage memory L located by address register G into buffer register
B. Casregister J-C stores the main-memory address of the location to which the
current 16 bits in subregister B(BUF) are to be transferred. This address is transferred
from casregister J-C to address register HB. Register WB is set to 1 to indicate a
write operation, and a request to main memory M is then made by setting register
XB to 1 if XB is 0. The IOU waits in a loop as long as XB is not 0. Once XB is 0 and the
request has been made, the IOU enters another wait loop, examining register YB until
YB is 1. When YB is 1, it indicates


Fig. 8.21 Sequence chart for the input output unit




that the requested memory operation is completed. The IOU acknowledges it by
resetting register YB to 0. Counter C is next incremented by 1. Register B is circularly
shifted 16 bit-positions to the left. Counter C is then tested for 0. If it is not 0, it
repeats the transfer of the main-memory address in casregister J-C to register HB,
the setting of registers WB and XB both to 1, and so forth, until counter C reaches
63. At this time, the 1024-bit word in register B has been transferred as a page to main
memory M. Note that the first location of the page in main memory M is provided
by page address register J and the "local addresses" are provided by counter C. The
transfer from large-storage memory L to main memory M is now completed. Register
Q is reset to 0 to indicate that the IOU is now free, and the IOU returns to the wait
loop until register Q contains 1 again.
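
The data movement of the input sequence can be sketched in Python as follows. The sketch follows the prose description and moves all 64 words of the page; it does not model the handshake registers XB and YB or the exact loop test of the control statements. The function name is hypothetical, and the 1024-bit buffer register B is held in a Python integer whose most significant bit stands for B(0).

```python
WORD_BITS = 16
B_BITS = 1024
B_MASK = (1 << B_BITS) - 1


def input_page(l_word, page, mem):
    """Copy one 1024-bit word of memory L into page `page` of memory M.

    Returns the final contents of register B.
    """
    b = l_word & B_MASK                               # B <- L(G)
    for c in range(64):                               # counter C
        address = (page << 6) | c                     # HB <- J-C
        mem[address] = b >> (B_BITS - WORD_BITS)      # B(BUF) = B(0-15)
        top = b >> (B_BITS - WORD_BITS)
        b = ((b << WORD_BITS) | top) & B_MASK         # B <- 16 cil B
    return b
```

Because register B is circulated rather than shifted, it holds the original word again after the 64 transfers.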

When register Z contains a 1, it is the output sequence. A page of 64 words is
read out of main memory M into buffer register B, and is then stored into large-storage
memory L. As shown in Fig. 8.21, the micro-operations to achieve this data
transfer are similar to those in the input sequence. When this data transfer is accomplished,
register Q is again reset to 0 to indicate that the IOU is free. The IOU returns
to the wait loop until register Q contains 1 again.

The above IOU operation, as shown in the sequence chart in Fig. 8.21, is now 
described by the CDL statements below. 


Comment, the IOU Operation (8.52)
/START(ON)/ Q←0, YB←0, S←8,
/KS(8)*P/   IF (Q≠0) THEN (S←10),
/KS(10)*P/  C←0,
            IF (Z=0) THEN (S←6) ELSE (S←14),

Comment, input sequence
/KS(6)*P/   B←L(G), S←7,
/KS(7)*P/   HB←J-C, WB←1, S←2,
/KS(2)*P/   IF (XB=0) THEN (XB←1, S←3),
/KS(3)*P/   IF (YB=1) THEN (S←1),
/KS(1)*P/   YB←0, C←countup C, S←5,
/KS(5)*P/   B←16 cil B,
            IF (C≠63) THEN (S←7) ELSE (S←12),

Comment, output sequence
/KS(14)*P/  HB←J-C, WB←0, S←4,
/KS(4)*P/   IF (XB=0) THEN (XB←1, S←15),
/KS(15)*P/  IF (YB=1) THEN (S←11),
/KS(11)*P/  YB←0, C←countup C, S←9,
/KS(9)*P/   B←16 cil B,
            IF (C≠63) THEN (S←14) ELSE (S←13),
/KS(13)*P/  L(G)←B, S←12,
/KS(12)*P/  Q←0, S←8,
END


8.4.3 Central Processing Unit 


The central processing unit or CPU consists of accumulator A, buffer register 
R, address register HA, program register D, counter C, and control register F. In 
addition, there are two single-bit registers WA and YA. Register WA, when 0, indicates 
a read operation; otherwise, it indicates a write operation. Register YA, when 1, 
indicates that the requested memory operation is completed. These registers are shown 
in Fig. 8.19 and the other CPU elements are described by the following declaration 
statements: 

Comment, configuration of the CPU (8.53) 

Register, A(0-15), $accumulator
          C(0-5), $counter
          R(0-15), $buffer register
          HA(0-11), $address register
          D(0-11), $program register
          F(0-4), $CPU control register
          WA, $read operation when 0; else, write operation
          YA, $memory request by CPU is complete when 1
Subregister, F(OP)=F(1-4),
             R(OP)=R(0-3),
             R(ADDR)=R(4-15),
Decoder, KF(0-31)=F,
Switch, START(ON),
Clock, P.

The sequential operation of the CPU is shown in Fig. 8.22. After being initialized
by the START switch, the CPU proceeds to the fetch sequence and then branches
to a particular sequence according to the op-code decoded during the fetch sequence.
Whenever the CPU makes a memory request during the fetch sequence or an execution
sequence, it enters a wait loop during which register YA is continuously
examined. When register YA contains a 1, it indicates that the requested memory
operation is completed; the CPU resumes its operation.
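
The request and completion handshake between the CPU and the MU can be illustrated with a simplified Python sketch of a single memory read. The function name, the state names, and the tick count are hypothetical; within each tick the CPU is evaluated before the MU, which is only one of several possible orderings of two independently clocked units.

```python
def simulate_read(mem, address, max_ticks=20):
    """Run one CPU read request through the XA/YA handshake.

    Returns (word read, number of ticks used).
    """
    flags = {"XA": 0, "YA": 0}
    cpu, mu, r = "request", "wait", None
    for tick in range(1, max_ticks + 1):
        # CPU side: raise XA when it is free, then wait for YA.
        if cpu == "request" and flags["XA"] == 0:
            flags["XA"] = 1
            cpu = "wait"
        elif cpu == "wait" and flags["YA"] == 1:
            flags["YA"] = 0                 # acknowledge completion
            return r, tick
        # MU side: acknowledge XA, access memory, then raise YA.
        if mu == "wait" and flags["XA"] == 1:
            flags["XA"] = 0
            mu = "access"
        elif mu == "access":
            r = mem[address]                # data returned to register R
            flags["YA"] = 1
            mu = "wait"
    raise RuntimeError("handshake did not complete")
```

Neither unit assumes anything about the timing of the other; each one only sets a flag and then waits for the other unit to clear or set the matching flag.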


Fig. 8.22 Sequence chart for the central processing unit




As shown in Fig. 8.22, when the START switch is turned ON, registers D and
YA are reset to 0. The fetch cycle begins by transferring the contents of register D
to address register HA. It then resets register WA to 0 to indicate a read operation
and makes a memory request for an instruction by setting register XA in the MU
to 1 if XA is 0. If XA is not 0, the CPU waits until it is 0. The CPU now enters the
memory wait loop. When register YA is found to be 1, the CPU acknowledges it by
resetting register YA to 0 and then incrementing register D by 1. The contents of
subregisters R(OP) and R(ADDR), which are the op-code and the operand address,
respectively, are now transferred to subregister F(OP) and register HA, respectively.
At this time, the op-code which specifies a particular execution sequence is in control
register F, and the operand address is in address register HA. The fetch cycle is now
completed. Note that the first instruction is fetched from the first location of memory
M because register D is initially reset to 0.

If the op-code is octal 15 or 16, bit F(4) indicates the input sequence when it is
0 or the output sequence when it is 1. The CPU enters the wait loop, continuously
examining register Q. When register Q contains a 0, it indicates that the IOU is free;
otherwise it is busy. When the IOU is free, the contents of bit F(4) are transferred to
register Z in the IOU. The CPU requests the MU by setting register XA in the MU
to 1 if XA is 0 and indicates a read operation by resetting register WA to 0. The
CPU now enters the memory wait loop. When register YA becomes 1, the CPU
acknowledges it by resetting YA to 0 and transfers the contents of register R to
casregister J-G in the IOU. The IOU is now initiated by setting register Q to 1 if Q
is 0. If Q is not 0, the CPU waits until Q becomes 0. The input or output sequence is
now initiated and the CPU proceeds to the fetch sequence. The sequences for other
instructions are similar.

The above CPU operation is partially shown in Fig. 8.22. It is now described 
in its entirety by the following CDL execution statements: 


Comment, the CPU Operation (8.54)
/START(ON)/ D←0, YA←0, F←17,

Comment, fetch sequence
/KF(17)*P/  HA←D, WA←0, F←8,
/KF(8)*P/   IF (XA=0) THEN (XA←1, F←16),
/KF(16)*P/  IF (YA=1) THEN (F←0),
/KF(0)*P/   D←countup D, YA←0, F←4,
/KF(4)*P/   F(OP)←R(OP), HA←R(ADDR),

Comment, input-output sequence
/(KF(13)+KF(14))*P/ Z←F(4), F←12,
/KF(12)*P/  IF (Q=0) THEN (F←30),
/KF(30)*P/  WA←0, F←28,
/KF(28)*P/  IF (XA=0) THEN (XA←1, F←31),
/KF(31)*P/  IF (YA=1) THEN (F←23),
/KF(23)*P/  J-G←R, YA←0, F←28,
/KF(28)*P/  IF (Q=0) THEN (Q←1, F←17),

Comment, add sequence
/KF(6)*P/   WA←0, F←15,
/KF(15)*P/  IF (XA=0) THEN (XA←1, F←22),
/KF(22)*P/  IF (YA=1) THEN (F←20),
/KF(20)*P/  A←A add R, YA←0, F←17,

Comment, subtract sequence
/KF(5)*P/   WA←0, F←21,
/KF(21)*P/  IF (XA=0) THEN (XA←1, F←7),
/KF(7)*P/   IF (YA=1) THEN (F←19),
/KF(19)*P/  A←A sub R, YA←0, F←17,

Comment, load sequence
/KF(2)*P/   WA←0, F←24,
/KF(24)*P/  IF (XA=0) THEN (XA←1, F←18),
/KF(18)*P/  IF (YA=1) THEN (F←25),
/KF(25)*P/  A←R, YA←0, F←17,

Comment, store sequence
/KF(1)*P/   R←A, WA←1, F←14,
/KF(14)*P/  IF (XA=0) THEN (XA←1, F←3),
/KF(3)*P/   IF (YA=1) THEN (F←27),
/KF(27)*P/  YA←0, F←17,

Comment, jump-on-minus sequence
/KF(9)*P/   IF (A(0)=1) THEN (F←10) ELSE (F←17),

Comment, jump sequence
/KF(10)*P/  D←R(ADDR), F←17,
END


8.4.4 Interfaces 


Statements (8.50), (8.52), and (8.54) describe three independent system units, each having its own
control. Thus, there are three control cycles, operating independently and asynchro- 
nously. The MU constantly serves the CPU and the IOU. The IOU, once initiated by 




the CPU, is left to execute an input or an output sequence. At the same time, the 
CPU executes the program stored in main memory. 

The three system units communicate with each other by means of interfaces.
There are three interfaces: one between the CPU and the IOU, another between the
MU and the CPU, and the third between the MU and the IOU. These interfaces are 
lines for address, data, status, and control required by the micro-operations described 
above. These interfaces are shown in Fig. 8.23. If the clock is shared, there are addi- 
tional clock or timing lines. There are 19 lines between the CPU and the IOU. Two 
are control lines due to the CPU micro-operations for requesting an I/O sequence 


Fig. 8.23 Interfaces among the system units of a stored-program computer




and for indicating whether the requested sequence is input or output,

Q←1 and Z←F(4).

One is a status line due to the following condition tested at the CPU,

Q=0.

The remaining 16 lines are address lines due to the following CPU micro-operation
for transferring the address word from the CPU to the IOU,

J-G←R.


There are 48 lines between the CPU and the MU in all: 12 address lines, 32 data
lines, one status line, and three control lines. The three control lines come from the
micro-operations for requesting the memory by the CPU, for indicating by the MU
that the request is complete, and for indicating whether the requested memory reference
is read or write, as follows:

XA←1, YA←1, and N←WA.

The status line is due to the following test condition,

XA=0.

The 12 address lines are due to the following micro-operation for transferring the
memory address from the CPU to the MU,

HZ←HA.

The data lines come from the micro-operations for transferring the data between
register R in the CPU and register RZ in the MU as follows,

R←RZ and RZ←R.


Similarly, there are a total of 48 lines between the MU and the IOU. 
The above three interfaces are now described by the following terminal statements.


Comment, the interface between the CPU and the IOU (8.55)
Terminal, R(0-5), $page address lines to IOU register J
          R(6-15), $word address lines to IOU register G
          F(4), $indicating input or output from CPU
          Q, $IOU status line from IOU
          IO-REQ, $I/O request line from CPU

Comment, the interface between the MU and the CPU (8.56)
Terminal, XA, $memory status line from MU
          MEM-REQ-A, $memory request line from CPU
          COMPLETE-A, $reference complete line from MU
          WA, $indicating read or write from CPU
          HA(0-11), $memory address lines from CPU
          RZ(0-15), $data-out lines from MU
          R(0-15), $data-in lines from CPU

Comment, the interface between the MU and the IOU (8.57)
Terminal, XB, $memory status line from MU
          MEM-REQ-B, $memory request line from IOU
          COMPLETE-B, $reference complete line from MU
          WB, $indicating read or write from IOU
          HB(0-11), $memory address lines from IOU
          RZ(0-15), $data-out lines from MU
          B(BUF), $data-in lines from IOU
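
The line counts quoted for the interfaces can be checked with a few lines of Python. The dictionary and function names are hypothetical; the numbers are those given in the text for the CPU-IOU and the CPU-MU interfaces.

```python
# Lines between the CPU and the IOU: Q<-1 and Z<-F(4), Q=0, J-G<-R.
CPU_IOU = {"control": 2, "status": 1, "address": 16}

# Lines between the CPU and the MU: HZ<-HA, R<-RZ and RZ<-R, XA=0,
# and XA<-1, YA<-1, N<-WA.
CPU_MU = {"address": 12, "data": 32, "status": 1, "control": 3}


def total_lines(interface):
    """Add up the lines of every kind in one interface."""
    return sum(interface.values())
```

The 32 data lines of the CPU-MU interface are two unidirectional sets of 16, one for each direction of transfer between registers R and RZ.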


8.5 Computer System Configuration 


The computer system so far described has three system units; a practical com- 
puter system has more. Normally, it has additional system units of channel, control 
unit, and input-output device. This section introduces these system units and shows 
their participation to form a computer system. 


8.5.1 Channel 


А computer system with three system units, MU, CPU, and IOU, is described in 
the last section. The IOU contains, among others, a large storage memory L. If there 
is a need for two or more large storage memories, the hardware which performs the 
functions of commanding, buffering, transferring, counting, and control can be 
shared by these memories. To accomplish this sharing of hardware, the IOU is 
organized into two system units, as shown in the diagram in Fig. 8.24. One system 
unit consists essentially of the large storage memory L and is called the large storage
unit or LSU, and the other system unit consists of the remaining elements and is called 
the channel or CH. If three large storage memories are to be connected to the com- 
puter system, and if only one of them is to be accessed at any one time, these three 
LSU’s are connected to the computer system through the channel in a manner shown 
in Fig. 8.25. In this section the channel organization for the system shown in Fig. 
8.24 is described. 




Fig. 8.24 A computer system with a channel CH and a large 
storage unit LSU 


Fig. 8.25 A computer with a channel CH and three large
storage units, LSU1, LSU2, and LSU3 


The channel consists mainly of register HB for storing the MU address, register 
G for storing the LSU address, register J for storing the page address of the MU, 
data register B, counter C, and single-bit registers WB, YB, Q, and Z. These registers 
are the same as those described in statement (8.51) except that register B now has 
only 16 bits. They are shown in Fig. 8.26 and described below. 


Comment, configuration of the channel (8.58)
Register,    HB(0-11),  $register for the MU address
             B(0-15),   $data register
             G(0-9),    $register for the LSU address
             J(0-5),    $page address register
             C(0-5),    $counter
             S(0-3),    $CH control register
             WB,        $read when 0; else, write
             YB,        $memory operation complete when 1
             Q,         $CH busy when 1; else, free
             Z,         $output sequence when 1; else, input sequence
Subregister, B(BUF)=B(0-15),
Decoder,     KS(0-15)=S,
Clock,       P,
Switch,      START(ON),


Fig. 8.26 Configurations of the channel and the large storage
unit


The large storage unit consists of the large storage memory L, address register
MAR, and buffer register LB. In addition, there are register T and three single-bit
registers RW, I, and V. Register T together with the clock sequences the LSU
operation. Register RW indicates whether the operation is read (i.e., input) or write
(i.e., output). Register I indicates whether the operation is a transfer of 16 bits or
a read/write operation. Register V initiates the LSU operation; the LSU is thus
controlled by register V. These registers are shown in Fig. 8.26 and described below.


Comment, configuration of the large storage unit (8.59)
Register,    LB(0-1023), $LSU buffer register
             MAR(0-9),   $LSU address register
             T(0-1),     $LSU control register
             RW,         $read when 0; else, write
             V,          $initiate LSU sequence
             I,          $indicate data transfer when 1; else, an LSU read/write
                         operation
Memory,      L(MAR)=L(0-1023,0-1023),
Decoder,     KT(0-3)=T,
Clock,       P,
Switch,      START(ON),


The operation of the LSU is described by the sequence chart in Fig. 8.27. As
shown, the LSU sequence begins whenever register V is set to 1. If register I is 0, it
indicates a data transfer operation, and the contents of register LB are circularly
shifted to the left by 16 bit positions. If register I is 1, it indicates a read or write
operation, depending on whether register RW is 0 or 1, respectively. If register RW is 1,
the data word in register LB is stored into memory L; otherwise, a data word is read
out of memory L and into register LB. In each of the three cases above, the
sequence is terminated by resetting register V to 0.
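The three branches of the LSU sequence can be traced with a short simulation. The following Python sketch is an illustration only and is not part of the CDL description: the class name LSU, the method name step, and the choice to hold register LB as one 1024-bit Python integer are assumptions made here. It follows the sense stated in the paragraph above (I = 0 selects the 16-bit circular shift, I = 1 selects a memory read or write); if the opposite sense is intended, only the test on I changes.

```python
from dataclasses import dataclass, field

WORD_BITS = 1024   # width of LB and of each word of L, from statement (8.59)
WORDS = 1024       # number of words in L, addressed by the 10-bit MAR


@dataclass
class LSU:
    lb: int = 0    # buffer register LB, held as a 1024-bit integer
    mar: int = 0   # address register MAR
    rw: int = 0    # RW: 0 = read, 1 = write
    i: int = 0     # I: 0 = 16-bit transfer shift, 1 = read/write (assumed sense)
    v: int = 0     # V: set to 1 by the channel to start one sequence
    mem: list = field(default_factory=lambda: [0] * WORDS)

    def step(self):
        """Run one LSU sequence if V is set; otherwise remain in the waiting loop."""
        if not self.v:
            return
        if self.i == 0:
            # LB <- 16 cil LB : circular left shift by 16 bit positions
            mask = (1 << WORD_BITS) - 1
            self.lb = ((self.lb << 16) | (self.lb >> (WORD_BITS - 16))) & mask
        elif self.rw == 1:
            self.mem[self.mar] = self.lb   # L(MAR) <- LB
        else:
            self.lb = self.mem[self.mar]   # LB <- L(MAR)
        self.v = 0                         # every branch ends by resetting V
```

In this model the channel would set RW, I, and MAR, raise V, and then wait until V returns to 0 before issuing the next request.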

The operation of the channel is described by the sequence chart in Fig. 8.28.
It is similar to the operation of the IOU described by the sequence chart in Fig. 8.21.
The major difference is that the micro-operations of accessing the large storage
memory L and of the circular left shift are now carried out in the LSU and
controlled by the CH by means of registers V and I. Furthermore, data transfers are
needed between the LSU and the CH; these are initiated by the CH.

As shown in Fig. 8.26, there are 47 lines between the CH and the LSU. These
lines are called the I/O interface since the I/O devices are attached to these lines. For
the computer system shown in Fig. 8.24, the interface between the MU and the CH
remains the same as that between the MU and the IOU in Fig. 8.23. The interface
between the CPU and the CH remains the same as that between the CPU and the
IOU.

The computer system shown in Fig. 8.25 has three LSU's attached to the channel.
It therefore requires a field in the I/O instruction that specifies the LSU
(i.e., an I/O device address). The channel can be implemented to execute commands
other than the read and write operations. The command code, the I/O device address,
the main memory address where the data are accessed, and the other necessary
information can be formatted into a channel command word for execution by the channel.
It may also be possible for the channel to execute a channel program, which is a
sequence of channel command words. The channel program can be stored in the
main memory or in a memory which is a part of the channel. Thus, the channel
can grow and become a processing unit that performs some I/O functions which
otherwise would be carried out by the CPU. The channel performs these functions
concurrently with the CPU; this concurrent operation greatly improves utilization of the
CPU.


Fig. 8.27 Sequence chart for the large storage unit


8.5.2 Control Unit 


An input-output unit such as the LSU in Fig. 8.24 is commonly built as two
units, the control unit or CU and the input-output device or IOD, because the IOD
is usually electromechanical while the CU is electronic. A computer system with
separate CU and IOD is shown in Fig. 8.29. Separation into CU and IOD makes it
possible for one CU to control one or more identical or similar IOD's. A computer
system with one CU and three IOD's is shown in Fig. 8.30. This system, however,
allows only one IOD to be operated at any one time.


Fig. 8.28 Sequence chart for the channel


Fig. 8.29 A computer system with a channel, a control unit, and
an input-output device


Fig. 8.30 A computer system with a channel, a control unit, and 
three input-output devices 


Extending the idea of connecting several IOD's to one control unit, several
control units can be connected to one channel. A computer system with this
configuration is shown in Fig. 8.31. Separate control units are often needed because
different types of IOD's require different types of control units.
The configuration in Fig. 8.31, however, allows only one CU and one IOD at a time to
communicate with the MU of the system. For more than one IOD to communicate
with the MU, more than one channel is required. A computer system with two
channels is shown in Fig. 8.32, where control units CU1 and CU2 are connected to
channel A, and control units CU3 and CU4 are connected to channel B. In this
system, two IOD's can communicate with the MU at the same time.

The channel described above handles one IOD at one time. There are channels 
which can be operated in a time-sharing mode to handle many low-speed IOD's. 




Fig. 8.31 А computer system with three control units, each 
having one input-output device 


Fig. 8.32 A computer system with channels A and B. Each 
channel has two control units, each having one input- 
output device 


Such a time-sharing mode of operation is often called multiplexing, and such a channel
is called a multiplexor channel, as will be further described in Chapters 9 and 10.


8.5.3 Modular Configuration 


The connections of the channels, control units, and input-output devices of the
computer system shown in Fig. 8.32 have a tree-like structure. Each IOD can be
accessed by the CPU or MU, but this access has a definite path and must pass through a
particular CH and a particular CU to reach the IOD. If the CH or the CU is busy (or has
failed), the other IOD's connected to this CU or to this CH are not accessible until the
path becomes free (or is repaired).

The accessibility of the IOD's by the CPU and MU can be improved by using a
crossbar switch. A computer system having a 2 by 4 crossbar switch connecting the
CU's and the IOD's is shown in Fig. 8.33, where either of the two CU's can access
any of the four IOD's. A computer system having a 2 by 4 crossbar switch and a 2
by 2 crossbar switch is shown in Fig. 8.34. In this system, the 2 by 4 crossbar switch
connects the CU's and the IOD's, while the 2 by 2 crossbar switch connects the CH's
and the CU's. Any of the four IOD's can be accessed by either channel, and two
IOD's can be operated at the same time.


Fig. 8.33 A computer with one crossbar switch. Each CU can
access any of the four IOD's


Fig. 8.34 A computer system with two crossbar switches. Each
channel can access any of the four IOD's


Another computer system with two crossbar switches is shown in Fig. 8.35. The
4 by 4 crossbar switch connects the two CPU's and the two CH's to the four MU's;
this switch enables either CPU or either CH to access any of the four MU's. The
2 by n crossbar switch connects the CH's and the n CU's; this
switch makes it possible for either CH to access any of the n IOD's. Failure of any one
of the MU's, CPU's, CH's, CU's, and IOD's need not cripple the operation of the computer
system; it may only degrade its performance. Because of the modularity of the system
units of the computer system in Fig. 8.35, the computer system is said to have a
modular configuration.


Fig. 8.35 A computer system with two crossbar switches
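The gain in accessibility can be made concrete by counting which IOD's remain reachable when a unit is busy or has failed. The sketch below is a simplified model, not a description of any hardware in this chapter: the function names, the unit names (CHA, CU1, IOD1, and so on), and the representation of a unavailable unit as a member of the set `down` are all assumptions introduced for illustration. The first function models the tree of Fig. 8.32; the second models a pair of crossbar switches as in Fig. 8.34, where any available CH can reach any available CU, and any available CU can reach any IOD.

```python
def reachable_tree(iod_parent, cu_parent, down):
    """Tree structure: each IOD hangs off one CU, and each CU off one CH."""
    ok = set()
    for iod, cu in iod_parent.items():
        ch = cu_parent[cu]
        if not ({iod, cu, ch} & down):   # the whole fixed path must be available
            ok.add(iod)
    return ok


def reachable_crossbar(iods, cus, chs, down):
    """Two crossbars: CH-to-CU and CU-to-IOD connections are all selectable."""
    some_cu = any(cu not in down for cu in cus)
    some_ch = any(ch not in down for ch in chs)
    if not (some_cu and some_ch):
        return set()
    return {iod for iod in iods if iod not in down}
```

With channel A unavailable, the tree model loses the two IOD's behind it, while the crossbar model still reaches all four through channel B.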


Fig. 8.36 A state diagram 


References†


1. MERSEL, J., "Program Interrupt on the Univac Scientific Computer," Proc. of the WJCC,
   1956, pp. 52-53.

2. BROOKS, F. P., JR., "A Program-Controlled Program Interruption System," Proc. of
   the Eastern Joint Computer Conference, 1957, pp. 128-132.

3. TURNER, L. R., and RAWLINGS, J. H., "Realization of Randomly Timed Computer
   Input and Output by Means of an Interrupt Feature," IRE Trans. on Elec. Comp., June,
   1958, pp. 141-149.

4. BECKMAN, F. S., BROOKS, F. P., JR., and LAWLESS, W. J., JR., "Developments in the
   Logical Organization of Computer Arithmetic and Control Units," Proc. of the IRE,
   January, 1961, pp. 53-66.

5. BUCHHOLZ, W., ed., Planning a Computer System. New York: McGraw-Hill Book
   Company, 1962.

6. GRASSELLI, A., "The Design of Program-modifiable Microprogrammed Control Units,"
   IRE Trans. on Elec. Comp., June, 1962, pp. 336-339.

7. BLAAUW, G. A., "Multisystem Organization," part V of "The Structure of System/360,"
   IBM Systems Journal 3, Nos. 2 and 3, 1964, pp. 181-195.

8. McGEE, W. C. and PETERSEN, H. E., "Microprogram Control for the Experimental
   Sciences," Proc. of the FJCC, 1965, New York: Spartan Books, pp. 77-91.

9. CAMPBELL, C. R. and NEILSON, D. A., "Microprogramming the Spectra 70/35,"
   Datamation, September, 1966, pp. 64-67.

10. GOUNTANIS, R. J. and VISS, N. L., "A Method of Processor Selection for Interrupt
    Handling in a Multiprocessor System," Proc. of the IEEE, December, 1966, pp. 1812-1819.

11. FLORES, I., Computer Organization. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1969.

12. VANDLING, G. C. and WALDECKER, D. E., "The Microprogram Control Technique for
    Digital Logic Design," Computer Design, August, 1969, pp. 44-51.

13. RAMAMOORTHY, C. V. and TSUCHIYA, M., "A Study of User-microprogrammable
    Computers," Proc. of the SJCC, 1970, pp. 165-181.


†References on microprogramming in Chapter 3 are also references in this chapter.


Problems


8.1.  Draw diagrams showing the control organizations described by statements (8.1),
      (8.3), and (8.5) and comment on their differences.

8.2.  Draw diagrams showing the control organizations described by statements (8.2),
      (8.4), and (8.6) and comment on their differences.

8.3.  A state diagram is given in Fig. 8.36 where each circled number represents a state.
      Conceive a sequential-logic control organization which sequences according to the
      states of the diagram.

8.4.  From the microprogram described in statements (8.18)-(8.21), prepare the three control
      words in 32 bits of 0's and 1's.

8.5.  Revise the organization and the microprogram of the microprogrammed CPU if the
      binary numbers to be compared are in the signed magnitude representation.

8.6.  Prepare a microprogram for multiplying the binary number in register A by the binary
      number in register B, assuming that these numbers are in signed magnitude
      representation. Additional computer elements and operations, if needed, may be chosen.

8.7.  Modify the sequencing description in statements (8.25) for a branch upon the condition
      that the number in the accumulator is smaller than the number in the MQ register.

8.8.  Modify the organization described in statements (8.27)-(8.30) to allow a repeat
      instruction which repeats execution of the instruction for a number of times specified
      in a count field.

8.9.  Draw a state diagram from the sequence chart in Fig. 8.20.

8.10. Repeat Problem 8.9 for the sequence chart in Fig. 8.21.

8.11. Repeat Problem 8.9 for the sequence chart in Fig. 8.22.

8.12. Complete the sequence chart in Fig. 8.22 for all other instructions.

8.13. Assume that the same clock P is available to the CPU, IOU, and MU. If main memory
      M requires four clock periods to complete its memory cycle and if large-storage
      memory L requires 10 clock periods, revise the statement descriptions (8.50), (8.52),
      and (8.54).

8.14. In making the request to read or write to memory by either the CPU or the IOU in
      Figs. 8.20-8.22, is there a queue? If there is, how can one implement a longer queue?

8.15. Describe in the CDL the sequential operation of the large storage unit as described by
      statements (8.59) and the sequence chart in Fig. 8.27. The description should include
      the timing and control signals.

8.16. Repeat Problem 8.15 for the channel described by statements (8.58) and the sequence
      chart in Fig. 8.28.

8.17. The data between the LSU and the CH of the computer system as shown in Fig. 8.24
      are transferred at a width of 16 bits. If the data are transferred at a width of 4 bits,
      (a) revise the configurations of the LSU and CH described in the block diagrams in
          Fig. 8.26 and the statements (8.58) and (8.59),
      (b) revise the sequences described in the charts in Figs. 8.27 and 8.28, and
      (c) revise the interface between the LSU and the CH as shown in the diagram in Fig.
          8.26.

8.18. Describe in the CDL the configurations and the sequences of the channel and the
      LSU's of the computer system shown in Fig. 8.25. Assume that the I/O instruction has
      a field which specifies the particular LSU, the data transfer between the MU and the
      CH is at a width of 16 bits, and the data transfer between the LSU and the CH is at a
      width of 4 bits.


Chapter 9 Computer Organization


Computer organization describes formats, codes, functional units, configuration,
and algorithms of a digital computer system. This chapter introduces the system
units of a modern digital computer system and then describes a particular computer.
The channels of this computer will be described in Chapter 10.


9.1 System Units 


A modern digital computer system is composed of many units: main memory 
(or main storage), central processing unit, channels, I/O interface, I/O control units, 
and I/O devices, in addition to an operating system. These system units of the digital 
computer system are shown in Fig. 9.1. 


9.1.1 Main Memory 


A stored-program computer operates by sequentially executing the instructions
of a program; the program and its data are stored in the main memory. There are two
types of programs: the system's programs and the user's programs. The system's
programs consist of an operating system, language processors, utility programs, and
the like. They are commonly referred to as software. The main memory is a random
access memory; it constitutes the most important difference between a stored-program
computer and the other types of digital computers.

Many problems arise in using a modern general-purpose digital computer, 
because the main memory does not have the capacity to hold the system’s and the 
user’s programs. The capacity of the main memory of a digital computer has been 
increased from a few thousand words in the very early digital computers to hundreds 
of thousands of words or more in a modern large-scale computer. Despite this tre- 
mendous progress in memory technology, the main memory capacity is still not large 
enough in some cases and is too costly in others. As a result, magnetic tape, drum, 
or disc memories which are slower but much larger in capacity are invariably added. 
The use of these various types of memories, with great differences in data rate, data
format, and access method, contributes to the complexity of the hardware and software
of a modern digital computer system.


9.1.2 Central Processing Unit 


The central processing unit (CPU) consists of the processing unit (sometimes
called the arithmetic and logical unit) and the main control unit. The main control
unit performs the sequencing of micro-operations in an instruction, the sequencing
of instructions in a program, and, in some computers, the switching from one
program to another. It also allows, in most computers, the interruption of the sequencing
by an internal or external signal. In executing an instruction, the main control unit
fetches the instruction from the main memory, interprets it, and then executes a
sequence of micro-operations.

The operations required in an instruction are carried out in the processing unit.
These can be binary as well as decimal arithmetic operations, logical operations, and
operations for comparison, translation, conversion, editing, and bit testing. The
data to be processed by these operations can be of fixed length, of variable length,
or both. A large-scale computer should have a powerful and comprehensive set of
instructions to perform various functions with minimum need for memory space.


Fig. 9.1 System units of a computer system




9.1.3 Channels 


Channels are units which allow concurrent operations of data processing in the
CPU and the I/O data transfers with I/O devices. There can be one or more I/O
control units connected to each channel. The cable that connects the channel and
the I/O control units is called the I/O interface, as indicated in Fig. 9.1. Since one or
more I/O devices are connected to each I/O control unit, it is possible to connect
many devices to one channel.

There are two types of channels: the selector channel and the multiplexor
channel. The selector channel allows the I/O data transfer with a high-speed I/O
device such as a magnetic drum storage. It is thus named because only one I/O device
can be selected on the channel at any one time. Once the channel is selected, the
operation is not interrupted until it is completed. The multiplexor channel allows
concurrent I/O data transfers with many low-speed I/O devices such as the card
reader and the line printer. This is accomplished by multiplexing (i.e., time-sharing)
the channel and the I/O interface.
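The idea of time-sharing one channel among several slow devices can be pictured as interleaving their bytes on the shared interface. The following sketch is only an illustration of that idea under simplifying assumptions made here: the function name multiplex, the device names, and the strict round-robin order are hypothetical, and a real multiplexor channel serves devices as they request service rather than in a fixed rotation.

```python
from collections import deque


def multiplex(device_buffers):
    """Interleave pending bytes from several devices, one byte per device per turn."""
    queues = {name: deque(data) for name, data in device_buffers.items()}
    order = []
    while any(queues.values()):          # continue while any device has data left
        for name, q in queues.items():
            if q:
                order.append((name, q.popleft()))
    return order
```

Each entry of the result pairs a device name with one byte, which corresponds to tagging every transfer on the shared interface with the address of the device it belongs to.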


9.1.4 I/O Devices and Control Units


External storage devices as well as the devices which the computer uses to
communicate with the outside world are referred to as I/O devices. A typical I/O device
requires some unique control equipment for its operation; this control equipment is
referred to as the I/O control unit. The control unit of an I/O device is commonly
built either as a separate unit or as an integral part of the I/O device. Some control units
can control only one device, while others can control a number of similar I/O devices.


9.1.5 Operating System 


A modern general-purpose computer system is complete with an operating 
system. An operating system is a set of programs that provide for the preparation 
and execution of the user’s programs. It is designed and implemented for the purpose 
of improving the productivity of both the user and the computer system. 

The programs (or routines) of an operating system may be divided into two
categories: processing programs and control programs. Processing programs are
those which assist the user in using the computer system and reduce the programming
effort. Examples of processing programs are assemblers, compilers, report generators,
sort programs, and utility routines such as card-to-tape programs. Control programs
are those which augment the computer to monitor, control, and operate the computer
system so that a continuous flow of jobs is processed with minimum intervention by
the operator. Examples of control programs are the monitor, loader, input-output
control system, scheduler, interrupt routine, and diagnostic program. Both the control
and processing programs, except for a portion which is resident in the main memory
unit, are in a system library stored in mass storage. In order to be effective, the
operating system should be designed at the same time the computer is designed. Only
in this way can the operating system become an integral part of the computer system.
System-oriented features such as memory protection, privileged instructions,
program relocation, automatic interrupt, and interval timer are now implemented by
hardware.


9.2 Formats and Codes 


In order to describe in some depth the organizations of commercially available 
computers, the IBM System/360 family of computers (models 25, 30, 40, 44, 50, 65, 
67, 75, 85, and 91) is chosen, partly because of its wide usage in many applications 
and partly because of the unprecedented effort spent in the design and implementation 
of both hardware and software. We begin by describing the formats and codes of the 
family because they are common to all the members of the family. 


9.2.1 Data Formats


The basic unit of information which is physically implemented by hardware is 
the bit. The basic unit of information that can be addressed and processed is eight 
bits which are conceptually grouped together and called a byte. (There is an additional 
odd parity bit associated with each byte.) The capacity of the main memory is 
described by the number of bytes. The widths of data paths among the registers, as 
well as among the system units, are in terms of bytes. 

Data formats are shown in Fig. 9.2. Information can be either in a fixed-length 
format or a variable-length format. Fixed-length information can be a byte, a half- 
word, a word, or a double word. A halfword is a group of two consecutive bytes. A 
word is a group of four consecutive bytes. A double word is a group of eight consecu- 
tive bytes. As shown in Fig. 9.2, a fixed-point number is either a halfword or a word 
(or fullword), and a floating-point number is either a word or a double word. Logical 
data are of fixed length and are in words. Fixed-length information stored in the
main storage must be located at the proper address boundaries as follows. Bytes can 
be stored at any address, halfwords at addresses that are multiples of two, words at 
addresses that are multiples of four, and double words at addresses that are multiples 
of eight. These boundary alignments of formats are illustrated in Fig. 9.2. Within a 
fixed-length format, the bits making up the format are consecutively numbered from 
left to right, starting with zero. 
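The boundary rule amounts to a divisibility test on the byte address. A minimal sketch follows; the table name ALIGNMENT and the function name is_aligned are introduced here for illustration and do not come from the System/360 documentation.

```python
# Required address multiple, in bytes, for each fixed-length format.
ALIGNMENT = {"byte": 1, "halfword": 2, "word": 4, "doubleword": 8}


def is_aligned(address, kind):
    """Return True if a byte address lies on the proper boundary for the format."""
    return address % ALIGNMENT[kind] == 0
```

For example, address 6 is a valid halfword address but not a valid word address, while address 16 is valid for all four formats.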

When the length of a field is stated explicitly, the information is said to have
variable length. As shown in Fig. 9.2, a decimal number or a string of characters is
of variable length. Variable-length fields can be from one to 256 bytes in increments
of one byte; no boundaries have to be observed. Decimal numbers in the packed
decimal format can be from one to sixteen bytes.


Fig. 9.2 Data formats (Courtesy of the International Business Machines Corporation)


9.2.2 Data Representations 


There are five types of data in the IBM System/360: fixed-point numbers,
floating-point numbers, decimal numbers, logical operands, and character strings.
Fixed-point numbers are binary integers in the signed 2's complement representation. As
shown in Fig. 9.2, a fixed-point number can be a halfword or a fullword with the sign
bit at the leftmost bit position.

Floating-point numbers are hexadecimal numbers which employ the 16 digit symbols
shown in Table 9.1. A floating-point number can be a word or a double word. As
shown in Fig. 9.2, the leftmost bit is the sign bit, the next seven bits are the exponent
part, and the remaining bits are the fraction, which is either six or fourteen
hexadecimal digits (24 or 56 bits). The fraction is in the signed magnitude
representation. The exponent is a 7-bit binary number whose base is 16; it is biased by
adding decimal 64 so that the resulting exponent is always a nonnegative integer. The
magnitudes of the numbers which can be expressed in the floating-point format range
from $16^{-64}$ to $16^{63}$, which is approximately $10^{-78}$ to $10^{75}$. A floating-point number
having a fraction of zero is automatically assigned an exponent of zero.


TABLE 9.1 Hexadecimal Code


BINARY    HEXADECIMAL
0000      0
0001      1
0010      2
0011      3
0100      4
0101      5
0110      6
0111      7
1000      8
1001      9
1010      A
1011      B
1100      C
1101      D
1110      E
1111      F
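The layout of the short floating-point format can be checked by decoding a 32-bit word by hand. The sketch below follows the description above (sign bit, 7-bit exponent biased by 64, 24-bit fraction, base 16); the function name decode_short_float and the use of a Python integer to hold the word are assumptions made for this illustration.

```python
def decode_short_float(word):
    """Decode a 32-bit short floating-point word into a Python float."""
    sign = -1.0 if (word >> 31) & 1 else 1.0      # leftmost bit is the sign
    exponent = (word >> 24) & 0x7F                # next seven bits, biased by 64
    fraction = (word & 0xFFFFFF) / float(1 << 24) # six hexadecimal digits, below 1
    return sign * fraction * 16.0 ** (exponent - 64)
```

As a usage example, the word with hexadecimal digits 41100000 has a biased exponent of 65 and a fraction of 1/16, so it decodes to 1.0; the word 42640000 decodes to 100.0.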

Decimal numbers are coded with 4-bit binary numbers as decimal digits, with
two decimal digits in each byte. This format is referred to as the packed decimal format.
There is another one called the zoned decimal format, which will be described
subsequently. Both formats are shown in Fig. 9.2. Decimal numbers in the packed decimal
format are of variable length, from one to 16 bytes. They are in the signed integer
representation with the sign in the rightmost four bits. Only the binary numbers
0000 to 1001 (decimal 0-9) are valid decimal-digit codes. Codes 1010 to 1111 are used
to represent the sign. The interpretation of the 4-bit sign is shown in Table 9.2, where
ASCII-8 and EBCDIC are codes to be described subsequently.


TABLE 9.2 Sign Representations of Packed
Decimal Numbers


BINARY NUMBER    INTERPRETATION
1010             + sign in ASCII-8
1011             - sign in ASCII-8
1100             + sign in EBCDIC
1101             - sign in EBCDIC
1110             + sign for any code
1111             + sign for any code
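The packed decimal format and the sign codes of Table 9.2 can be exercised with a small decoder. This is a sketch under the rules stated above, not an account of how the machine performs decimal arithmetic; the names unpack_decimal and MINUS_SIGNS are introduced here for illustration.

```python
MINUS_SIGNS = {0b1011, 0b1101}   # Table 9.2: minus sign in ASCII-8 and in EBCDIC


def unpack_decimal(data):
    """Convert a packed decimal field (1 to 16 bytes) to a Python integer."""
    if not 1 <= len(data) <= 16:
        raise ValueError("packed decimal fields are 1 to 16 bytes long")
    nibbles = []
    for byte in data:
        nibbles += [byte >> 4, byte & 0x0F]   # two 4-bit codes per byte
    *digits, sign = nibbles                   # rightmost four bits hold the sign
    if any(d > 9 for d in digits) or sign < 0b1010:
        raise ValueError("invalid digit or sign code")
    value = 0
    for d in digits:
        value = value * 10 + d
    return -value if sign in MINUS_SIGNS else value
```

A two-byte field with hexadecimal digits 123C therefore holds +123, and 123D holds -123.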


Fixed-length logical operands consist of one, four, or eight bytes. A 4-byte
fixed-length logical operand is shown in Fig. 9.2. A string of characters consists of a
string of one to 256 bytes.


9.2.3 Data Codes


One of the most familiar data codes has been the binary coded decimal (BCD) code.
The BCD code employs six bits to represent 64 alphameric and special
characters, as shown in Table 9.3. In the table, the first column shows the characters, and
the third column shows the BCD code called the external BCD code. The second
column shows another BCD code, called the internal BCD code, which some
computers such as the IBM 7090 family of computers use when the characters are stored
in the memory. The widespread use of the BCD code is partially due to the simple
translation from the IBM card code.


TABLE 9.3 Internal and External BCD Codes


CHARACTER    INTERNAL    EXTERNAL
             BCD CODE    BCD CODE
0            00          12
1            01          01
2            02          02
3            03          03
4            04          04
5            05          05
6            06          06
7            07          07
8            10          10
9            11          11
A            21          61
B            22          62
C            23          63
D            24          64
E            25          65
F            26          66
G            27          67
H            30          70
I            31          71
J            41          41
K            42          42
L            43          43
M            44          44
N            45          45
O            46          46
P            47          47
Q            50          50
R            51          51
S            62          22
T            63          23
U            64          24
V            65          25
W            66          26
X            67          27
Y            70          30
Z            71          31
+            20          60
-            40          40
*            54          54
/            61          21
blank        60          20
$            53          53
. (period)   33          73
)            34          74
(            74          34
=            13          13
, (comma)    73          33


The 6-bit code cannot conveniently represent more than 64 different characters.
An 8-bit code allows up to 256 characters or two binary coded decimal digits. A
BCD code extended from six bits to eight bits was chosen for the IBM System/360
computers. This extended version, called the extended binary coded decimal interchange
code (EBCDIC), is shown in Fig. 9.3(a). Note the numbering of bit positions in Fig.
9.3(a). The selection of eight bits as a byte is largely influenced by the adoption of
the 8-bit code. The conversion from the Hollerith code on a punched card to the
EBCDIC code is shown in Table 9.4.


PF   Punch off         BS   Backspace       SM   Set mode
HT   Horizontal tab    IL   Idle            PN   Punch on
LC   Lower case        BYP  Bypass          RS   Reader stop
DEL  Delete            LF   Line feed       UC   Upper case
RES  Restore           EOB  End of block    EOT  End of transmission
NL   New line          PRE  Prefix          SP   Space


Fig. 9.3(a) The extended binary coded decimal interchange
code (EBCDIC) (Courtesy of the International
Business Machines Corporation)

Another character code which was also chosen for the IBM System/360
computers is the 7-bit American Standard Code for Information
Interchange (ASCII). This code was extended into an 8-bit code (ASCII-8) (Fig. 9.3(b)).

When a decimal number is represented in EBCDIC, it is said to be in the zoned
decimal format (Fig. 9.2). In this format, the four leading bits of each byte (called the
zone portion) are all 1's. These leading 1's are chosen so as to form a collating sequence
in which numbers are made higher than alphabetics in alphameric fields. The zoned format
is used primarily for character-sensitive I/O devices. Decimal arithmetic operations
are performed on decimal numbers in the packed decimal format (not in the zoned
decimal format). An instruction is available to convert a decimal number from the
zoned format to the packed format. Another instruction is available to convert
decimal numbers in the packed decimal format into binary numbers for binary
arithmetic operations.


NULL  Null/Idle                CR       Carriage return
SOM   Start of message         SO       Shift out
EOA   End of address           SI       Shift in
EOM   End of message           DC0      Device control reserved for data link escape
EOT   End of transmission      DC1-DC3  Device control
WRU   "Who are you?"           ERR      Error
RU    "Are you...?"            SYNC     Synchronous idle
BELL  Audible signal           LEM      Logical end of media
BKSP  Backspace                S0-S7    Separator (information)
HT    Horizontal tabulation    b        Word separator (blank, normally non-printing)
LF    Line feed                ESC      Escape
VT    Vertical tabulation      DEL      Delete/Idle
FF    Form feed


Fig. 9.3(b) The 8-bit extension of the American standard code
for information interchange (ASCII-8) (Courtesy of
the International Business Machines Corporation)
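The conversion from the zoned format to the packed format can be sketched as a regrouping of 4-bit codes. The following Python function is an illustration and not the machine instruction itself; the name pack_zoned is hypothetical. It assumes, as is the practice on the System/360 although not stated in the text above, that the zone portion of the rightmost byte of a zoned number carries the sign code.

```python
def pack_zoned(zoned):
    """Regroup a zoned decimal field into packed decimal bytes."""
    digits = [byte & 0x0F for byte in zoned]   # numeric portion of every byte
    sign = zoned[-1] >> 4                      # assumed: zone of last byte is the sign
    nibbles = digits + [sign]                  # sign goes in the rightmost four bits
    if len(nibbles) % 2:
        nibbles.insert(0, 0)                   # pad with a leading zero digit
    return bytes((nibbles[k] << 4) | nibbles[k + 1]
                 for k in range(0, len(nibbles), 2))
```

Three zoned bytes with hexadecimal digits F1 F2 F3 thus become the two packed bytes 12 3F, halving the storage needed for the digits.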


TABLE 9.4 Hollerith Card Code and EBCDIC Code


CARD            PRINTER         EBCDIC
CODE            CHARACTER       CODE
(no punches)    blank           01000000
12,8,3          . (period)      01001011
12,8,4          <               01001100
12,8,5          (               01001101
12,8,6          +               01001110
12              &               01010000
11,8,3          $               01011011
11,8,4          *               01011100
11,8,5          )               01011101
11              -               01100000
0,1             /               01100001
0,8,3           ,               01101011
0,8,4           %               01101100
8,3             #               01111011
8,4             @               01111100
8,5             '               01111101
8,6             =               01111110
12,1            A               11000001
12,2            B               11000010
12,3            C               11000011
12,4            D               11000100
12,5            E               11000101
12,6            F               11000110
12,7            G               11000111
12,8            H               11001000
12,9            I               11001001
11,1            J               11010001
11,2            K               11010010
11,3            L               11010011
11,4            M               11010100
11,5            N               11010101
11,6            O               11010110
11,7            P               11010111
11,8            Q               11011000
11,9            R               11011001
0,2             S               11100010
0,3             T               11100011
0,4             U               11100100
0,5             V               11100101
0,6             W               11100110
0,7             X               11100111
0,8             Y               11101000
0,9             Z               11101001
0               0               11110000
1               1               11110001
2               2               11110010
3               3               11110011
4               4               11110100
5               5               11110101
6               6               11110110
7               7               11110111
8               8               11111000
9               9               11111001
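The letter and digit rows of Table 9.4 follow a regular pattern: the zone punch selects the four leading bits of the EBCDIC code and the digit punch supplies the four trailing bits. The Python sketch below illustrates this pattern only. The function name is hypothetical, and the special characters of the table are deliberately not handled because they do not follow the same rule.

```python
# Zone punch -> leading four bits, as read off the letter rows of Table 9.4.
ZONE_BITS = {12: 0b1100, 11: 0b1101, 0: 0b1110}


def card_to_ebcdic(punches: tuple) -> int:
    """Translate the card code of a letter or digit into its EBCDIC code.

    Only the letters A-Z and digits 0-9 are covered; any other
    combination of punches raises ValueError.
    """
    if len(punches) == 1 and 0 <= punches[0] <= 9:
        return 0b11110000 | punches[0]            # digits: zone 1111
    if len(punches) == 2:
        zone, digit = punches
        # 0,1 is excluded: Table 9.4 assigns it to "/" rather than to a letter.
        if zone in ZONE_BITS and 1 <= digit <= 9 and (zone, digit) != (0, 1):
            return (ZONE_BITS[zone] << 4) | digit
    raise ValueError(f"not a letter or digit card code: {punches}")
```

With this sketch, `card_to_ebcdic((12, 1))` gives 11000001 (the letter A) and `card_to_ebcdic((7,))` gives 11110111 (the digit 7), in agreement with the table.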


9.2.4 Instruction Formats


There are five basic instruction formats, as shown in Fig. 9.4. These instruction
formats contain one, two, or three halfwords. All instructions must be aligned on
halfword boundaries in the memory.

The five basic instruction formats are denoted by RR, RX, RS, SI, and SS. The
first 8-bit field of the instruction contains the operation code (op-code). The op-code
is coded as follows. Bits 0 and 1 specify instruction length as shown in Table 9.5, and
they also specify, indirectly, data location (main memory or registers). Bits 2 and 3
specify the type of data (fixed or variable length, decimal, binary, or floating-point).
Bits 4-7 specify what to do with the data. The other fields of the instruction specify
operands.


TABLE 9.5 Instruction-Length Code


            TWO LEFTMOST
FORMAT      BITS OF OP-CODE     NUMBER OF HALFWORDS
RR          00                  1
RX          01                  2
RS or SI    10                  2
SS          11                  3
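The coding of the op-code can be illustrated by a short Python sketch that splits an 8-bit op-code into the three subfields just described and looks up the instruction length in Table 9.5. The names `LENGTH_CODE` and `split_opcode` are hypothetical and are introduced here for illustration only. Bit 0 is taken as the leftmost bit, as in the table.

```python
# Two leftmost op-code bits -> (format, number of halfwords), from Table 9.5.
LENGTH_CODE = {
    0b00: ("RR", 1),
    0b01: ("RX", 2),
    0b10: ("RS or SI", 2),
    0b11: ("SS", 3),
}


def split_opcode(opcode: int):
    """Return (format, halfwords, data-type bits, operation bits) of an op-code."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("op-code must fit in 8 bits")
    length_bits = opcode >> 6             # bits 0 and 1
    type_bits = (opcode >> 4) & 0b11      # bits 2 and 3
    operation_bits = opcode & 0b1111      # bits 4-7
    fmt, halfwords = LENGTH_CODE[length_bits]
    return fmt, halfwords, type_bits, operation_bits
```

For instance, the op-code 00011010 is split into format RR, one halfword, data-type bits 01, and operation bits 1010.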

There are four types of operands: 


1. Register operand 
2. Storage operand 
3. Immediate operand 
4. Block operand 


FORMAT   FIRST HALFWORD                           SECOND HALFWORD           THIRD HALFWORD
RR       Op code (0-7), R1 (8-11), R2 (12-15)
RX       Op code (0-7), R1 (8-11), X2 (12-15)     B2 (16-19), D2 (20-31)
RS       Op code (0-7), R1 (8-11), R3 (12-15)     B2 (16-19), D2 (20-31)
SI       Op code (0-7), I2 (8-15)                 B1 (16-19), D1 (20-31)
SS       Op code (0-7), L1 (8-11), L2 (12-15)     B1 (16-19), D1 (20-31)    B2 (32-35), D2 (36-47)

(Bit positions are given in parentheses; the digit following a field name is the
number of the operand to which the field belongs.)


Fig. 9.4 Instruction formats (Courtesy of the International Business
Machines Corporation)


The register operand is stored in one of the 16 general-purpose registers. These are 
32-bit registers, to be called here simply registers. Their 4-bit address is denoted by 
R as shown in Fig. 9.4. Format RR denotes an operation on two register operands 
with the result replacing the first operand. 

The storage operand is stored in a location of the main storage. The storage
address is 24 bits. It is obtained by summing the 24-bit base address stored in one
register addressed by B, the 24-bit index address stored in another register addressed
by X, and the displacement which is the 12-bit D field of the instruction itself. The
sum is called the effective address. In forming an effective address, the base and index
are treated as unsigned 24-bit binary integers and the displacement as a 12-bit positive
binary integer, with overflow ignored. If either the B field or the X field is zero, then
zero itself (rather than the contents of register 0) must be used in forming the
effective address. Format RX denotes an operation on one operand in the register and
another in the main storage.
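The formation of the effective address can be sketched in Python as follows. This is an illustration under stated assumptions and not a description of the model 40 hardware: the function names are hypothetical, the general-purpose registers are modeled as a list of 16 integers, and the field positions of the RX format are those of Fig. 9.4.

```python
MASK_24 = 0xFFFFFF  # storage addresses are 24 bits


def decode_rx(instruction: int):
    """Split a 32-bit RX instruction into (op-code, R, X, B, D) fields."""
    op = (instruction >> 24) & 0xFF   # bits 0-7
    r = (instruction >> 20) & 0xF     # bits 8-11
    x = (instruction >> 16) & 0xF     # bits 12-15
    b = (instruction >> 12) & 0xF     # bits 16-19
    d = instruction & 0xFFF           # bits 20-31
    return op, r, x, b, d


def effective_address(registers, b: int, x: int, d: int) -> int:
    """Sum base, index, and displacement with overflow ignored."""
    if not 0 <= d <= 0xFFF:
        raise ValueError("displacement must fit in 12 bits")
    # A zero B or X field means the value zero, not the contents of register 0.
    base = registers[b] & MASK_24 if b != 0 else 0
    index = registers[x] & MASK_24 if x != 0 else 0
    return (base + index + d) & MASK_24
```

As a usage example, with register 6 holding the base 001000 (hex) and register 5 holding the index 000020 (hex), the instruction 5A356004 (hex) decodes to B=6, X=5, D=004 and yields the effective address 001024 (hex).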

Block operands refer to a group of operands; they can be in the general-purpose
registers. Format RS denotes an operation on the block operands in a set of
general-purpose registers starting with the register specified by R1 and ending with
the register specified by R3. The result is stored in a block of memory words whose
starting address is obtained by summing the base address specified by B2 and
displacement D2. This format may also denote an operation on the block operands in
the memory with the results stored in a set of general-purpose registers.

The immediate operand is the 8-bit I field itself. This I field is a character that
can be used immediately. Format SI denotes an operation on the immediate operand
with the result replacing the contents of the memory location whose address is
obtained by summing the base address located by B and displacement D.

Block operands can also be in the main memory. They are addressed by a
memory address and an operand length in number of bytes. Format SS denotes an
operation on two block operands of variable length in the main memory. The results
of the operation replace the first block operand. Each of the two memory addresses
is obtained by summing the base address located by B and displacement D, and each
of the operand lengths is specified by an L field of the instruction.

The use of base-displacement addressing has two advantages. The first advantage 
is to allow the reduction of instruction bits for addressing; this economy in bits 
requires the use of general-purpose registers for addressing storage. The second 
advantage is to permit easier program relocation because the base addresses can be 
specified at the load time. 


9.3 A Computer Organization 


In order to describe the organization of the IBM System/360 family of computers 
to some depth within the scope of a chapter, the description is limited to model 40. 
Though this is the medium-small member of the family, the concepts and configu- 
rations are generally applicable to other members of the family. 

An IBM System/360 model 40 computer system is shown in the block diagram 
in Fig. 9.5. It is a microprogrammed computer. There are a main storage, a CPU, 
one multiplexor channel, and two selector channels; the latter are optional. The 
channels are connected to the I/O devices through an I/O interface which is standard- 
ized for the family. To the multiplexor channel are connected a console typewriter 
and, via an integrated control unit, a printer and a card reader/punch. Three mag- 
netic tape drives are connected to selector channel 1 via magnetic tape control unit 
A and two disk drives via file control unit A. Two tape drives are connected to selector 
channel 2 via magnetic tape control unit B and two disk drives via file control unit B. 


9.3.1 Data Flow 


Data flow is the configuration of memories, registers, data paths, and other
elements which are organized to carry out the flow of data in performing various
micro-operations and sequences. The data can be addresses, numbers, codes, flags,
counts, commands, indicators, status words, control words, time, instructions, strings
of characters, and the like. The contents of the data flow represent the status of the
CPU and memories. A data flow of model 40 is shown in the block diagram in Fig.
9.6. As shown, major data paths are the bus connected to register R, the two input
buses to the arithmetic and logical unit (ALU), the output bus from the ALU, and
the interfaces for communicating with the I/O devices.


Fig. 9.5 An IBM System/360 model 40 computer system


9.3.2 Storages 


There are four storages: main storage, multiplex storage, local storage, and
read-only storage. The main storage stores program, data, and some control words.
The multiplex storage stores the control words of the multiplex channels. The local
storage provides general-purpose registers and temporary storage for channel
operations. The read-only storage is the memory where the microprogram is stored. The
characteristics of these storages are shown in Table 9.6. These storages are described
below.


Fig. 9.6 Simplified data flow for the IBM System/360 model 40


TABLE 9.6 Characteristics of Storages of IBM System/360 Model 40


                   MAIN                 MPX                   LOCAL                READ-ONLY
CHARACTERISTICS    STORAGE              STORAGE               STORAGE              STORAGE
Capacity†          16K to 256K bytes    128 to 1,024 words    144 words            4K, 6K or 8K
Cycle time         2.5 microseconds     2.5 microseconds      1.25 microseconds    625 nanoseconds
Word length        2 bytes or 16 bits   2 bytes or 16 bits    22 bits              56 bits
Address length     14 to 18 bits        7 to 10 bits‡         8 bits               12 or 13 bits


†K represents a multiplier of 1,024
‡In conjunction with stat Y1


9.3.2.1 Main storage 


The main storage is a random-access, magnetic-core memory with a cycle time
of 2.5 microseconds. The main-storage accesses for a read and for a write operation
are completely independent; this is known as a split read-write cycle. In either a read
or a write cycle, two bytes (18 bits including two parity bits) are transferred
simultaneously. It takes 1.4 or 1.1 microseconds for a read or a write cycle,
respectively. In split-cycle operation, a write cycle is always preceded by a read
cycle. The main storage has a capacity of 16K, 32K, 64K, 128K, or 256K bytes (where
1K bytes means 1,024 bytes) and is addressed by 14, 15, 16, 17, or 18 bits,
respectively. The main storage of maximum size is organized as two storage units, each
unit having 131,072 bytes and containing two arrays (each array stores a maximum of
64K bytes). The memory has register D as the buffer register and is addressed by the
address on the storage address bus (SAB). The SAB is connected to register A when the
processing unit (or the multiplexor channel) is in operation, to register S1 when
selector channel 1 is in operation, or to register S2 when selector channel 2 is in
operation. The main storage stores data from the CPU or from the I/O units and
delivers data to the CPU or the I/O units. Since the address for a main storage of
256K bytes requires 18 bits, the address register has to be larger than two bytes; the
extra three bits which provide for this and other purposes are called extender bits.
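The relation between storage capacity and address length quoted above and in Table 9.6 is simply the number of bits needed to count the addressable locations. A minimal Python sketch, with a hypothetical function name, is:

```python
def address_bits(locations: int) -> int:
    """Number of address bits needed to select one of `locations` locations."""
    if locations < 1:
        raise ValueError("a storage must have at least one location")
    return (locations - 1).bit_length()


K = 1024  # K represents a multiplier of 1,024

# Main storage: 16K to 256K bytes.
main_storage_bits = [address_bits(n * K) for n in (16, 32, 64, 128, 256)]

# Multiplex storage: 128 to 1,024 words.
mpx_bits = [address_bits(128), address_bits(1024)]
```

Here `main_storage_bits` evaluates to 14, 15, 16, 17, and 18, and `mpx_bits` to 7 and 10, which agrees with the address lengths listed in Table 9.6.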

Assuming a main storage with a capacity of 131,072 bytes, one can describe 
the main storage MS and its associated registers and bus as follows: 


Comment, description of the main storage (9.1)
Register, A(0-21),     $MS address register
          D(0-17),     $MS buffer register
Bus,      SAB(0-21),   $storage address bus
Memory,   MS(SAB)=MS(0-131071,0-17),


9.3.2.2 Multiplex storage 


Multiplex storage (MPX) comprises additional locations of the main storage. 
(It has separate Y drive lines and address bus; otherwise it shares with main storage 
the control, timing, and other circuits.) It has a capacity ranging from 256 to 2,048 
bytes with the choice of the size related to the main-storage size. It is addressed by 
multiplex address bus MPXAB. Assuming a multiplex storage with a capacity of 
2,048 bytes, one can describe it in the following manner: 


Comment, description of the multiplex storage (9.2)
Bus,    MPXAB(15-13,6-0),
Memory, MPX(MPXAB)=MPX(0-1023,0-17),


The access to multiplex storage is accomplished by setting a special control bit Y1 
called mpx stat. This will be described later. This access is available only to the 
microprogram. The multiplex storage contains unit control words and other control 
information necessary to sustain the operations of multiplexor subchannels. 


9.3.2.3 Local storage 


Local storage (LS) is a high-speed magnetic-core memory. It has a capacity of
144 22-bit words. The memory is operated in split cycle. A complete read-or-write
cycle takes 1.25 microseconds. The access time is 350 nanoseconds. Register R is
the buffer register and register LSAR is the address register. Registers H and J store
LS addresses. In addition, there is an incrementer. These registers and the local
storage are described below:


Comment, description of the local storage (9.3)
Register, LSAR(0-7),   $local storage address register
          H(0-7,P),    $store LS address
          J(0-7,P),    $store LS address
          R(0-21),     $local storage buffer register
Memory,   LS(LSAR)=LS(0-143,0-21),
Operator, X(4-7)←Y(4-7) increment K,


The operator increment above declares the 4-bit incrementer where K can be +1,
0, -1, or -2. Register LSAR, the incrementer, together with registers H and J form
the "address loop." By means of the address loop, the contents of register LSAR
are routed through the incrementer to registers H or J so that registers H or J hold
an LS address for subsequent use in register LSAR.
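The behavior of the 4-bit incrementer in the address loop can be sketched in Python as follows. The sketch rests on two assumptions that the description above does not state: bits 0-3 of the address pass through unchanged, and the 4-bit result wraps around modulo 16. The function name is hypothetical.

```python
ALLOWED_K = (+1, 0, -1, -2)  # increments available to the incrementer


def increment_ls_address(lsar: int, k: int) -> int:
    """Route an 8-bit LS address through the 4-bit incrementer.

    Bits 4-7 (the four rightmost bits) are incremented by k.
    Assumption: bits 0-3 are unchanged and the result wraps modulo 16.
    """
    if k not in ALLOWED_K:
        raise ValueError("K can only be +1, 0, -1, or -2")
    if not 0 <= lsar <= 0xFF:
        raise ValueError("LSAR holds an 8-bit address")
    high = lsar & 0xF0               # bits 0-3, passed through
    low = (lsar + k) & 0x0F          # bits 4-7 after incrementing
    return high | low
```

Under these assumptions, an address of 00100001 with K = +1 gives 00100010, which would then be held in register H or J for subsequent use in register LSAR.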


TABLE 9.7 Local Storage Map for Model 40


LOCATION    ALLOCATION
000-008     Working area
009-015     Working area and logout area
016-031     Not used
032-037     UCW for selector channel 1
038-041     Multiplexor channel word area
042         Interrupt buffer
043-047     Unassigned
048-053     UCW for selector channel 2
054-056     Unassigned
057-063     2nd level dump area
064-066     Working area in undefined state
067-071     Program status word
072         Start I/O switch
073-079     1st level dump area
080-095     Not used
096-111     Buffer for selector channel 1
112-127     Buffer for selector channel 2
128-191     Not used
192-207     Floating-point registers
208-223     Not used
224-255     Fixed-point registers


A storage map for the local storage is shown in Table 9.7. As indicated, the 
LS provides among others the 16 fixed-point registers, four floating-point registers, 
channel buffers, unit-control-word (UCW) storage, and the first and second level 
dump areas. Only the fixed-point and floating-point registers are accessible to the 
machine language program. The use of the local storage to temporarily store the 
contents of the CPU registers allows the CPU to be time-shared for channel opera- 
tions, and thus eliminates the cost of fully implemented channels. 
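Table 9.7 can be checked against the stated capacity of 144 words by a short Python sketch. The rows are transcribed from the table; the names `LS_MAP`, `allocation`, and `words_in_use` are hypothetical. Reading the "Not used" ranges as locations that are not counted in the capacity is an interpretation made here, not a statement of the text.

```python
# (first location, last location, allocation), transcribed from Table 9.7.
LS_MAP = [
    (0, 8, "Working area"),
    (9, 15, "Working area and logout area"),
    (16, 31, "Not used"),
    (32, 37, "UCW for selector channel 1"),
    (38, 41, "Multiplexor channel word area"),
    (42, 42, "Interrupt buffer"),
    (43, 47, "Unassigned"),
    (48, 53, "UCW for selector channel 2"),
    (54, 56, "Unassigned"),
    (57, 63, "2nd level dump area"),
    (64, 66, "Working area in undefined state"),
    (67, 71, "Program status word"),
    (72, 72, "Start I/O switch"),
    (73, 79, "1st level dump area"),
    (80, 95, "Not used"),
    (96, 111, "Buffer for selector channel 1"),
    (112, 127, "Buffer for selector channel 2"),
    (128, 191, "Not used"),
    (192, 207, "Floating-point registers"),
    (208, 223, "Not used"),
    (224, 255, "Fixed-point registers"),
]


def allocation(location: int) -> str:
    """Return the allocation of one local-storage location."""
    for first, last, name in LS_MAP:
        if first <= location <= last:
            return name
    raise ValueError("location outside the 8-bit LS address range")


def words_in_use() -> int:
    """Count the locations in every range except those marked 'Not used'."""
    return sum(last - first + 1
               for first, last, name in LS_MAP if name != "Not used")
```

Under the interpretation above, `words_in_use()` gives 144, the capacity of the local storage, although the 8-bit address spans locations 0 to 255.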


9.3.2.4 Read-only storage 


The read-only storage (ROS) is a transformer read-only memory. It is made up
of 16 modules, each containing 256 56-bit ROS words, to give the storage a capacity
of 4,096 words. The memory cycle time of the ROS and the clock cycle time of the CPU
are both 0.625 microsecond. Thus, there are four 0.625-microsecond ROS cycles for
each 2.5-microsecond main-storage cycle. The address register ROAR addresses the
ROS during CPU or multiplexor channel operation. Register ROSCAR1 addresses
the ROS during selector-channel-1 operation, while register ROSCAR2 addresses
the ROS during selector-channel-2 operation. The memory and its associated registers
are described below.


Comment, description of the read-only memory (9.4)
Register, ROAR(0-11),        $read-only address register
          ROSCAR1(0-12,P),   $read-only-storage channel-address register
                             for selector channel 1
          ROSCAR2(0-12,P),   $read-only-storage channel-address register
                             for selector channel 2
Memory,   ROS(ROAR)=ROS(0-255,0-55),


9.3.3 Registers 


The registers in the configuration of model 40 are grouped into data registers,
ALU registers, channel registers, ROS-address registers, storage-protect registers,
and stats. (In the IBM terminology, a register may be referred to as a data bus if it
is reset to 0 in every clock cycle.) Registers R, P, Q, and others described below may
also be referred to as data buses.

9.3.3.1 Data registers 


There are five data registers: A, B, C, D, and R. Registers A, C, and R each consist
of two 9-bit subregisters (eight data bits and one parity bit) and one 4-bit subregister
(three data bits and one parity bit); registers B and D consist of the two 9-bit
subregisters only. These registers are described below.


Comment, description of five data registers (9.5)
Register,    A(0-21),   $MS address register
             B(0-17),   $temporary storage
             C(0-21),   $temporary storage
             D(0-17),   $MS buffer storage
             R(0-21),   $buffer register for LS, data registers and display
Subregister, A0(0-7,P)=A(0-8),
             A1(0-7,P)=A(9-17),
             AX(5-7,P)=A(18-21),


The above subregister AX is called an extension register. The subregisters of
registers C and R are similarly defined. Subregisters from register B are defined below.


Subregister, B0(0-7,P)=B(0-8),
             B1(0-7,P)=B(9-17),


Subregisters from register D are similarly defined.
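The subregister declarations above amount to cutting the 22-bit register into three fields. The following Python sketch shows this for register A, with bit 0 taken as the leftmost bit as in the declarations. The function names are hypothetical and serve only to illustrate the bit positions.

```python
def split_a(a: int):
    """Split the 22-bit register A(0-21) into subregisters A0, A1, and AX."""
    if not 0 <= a < (1 << 22):
        raise ValueError("register A holds 22 bits")
    a0 = (a >> 13) & 0x1FF   # A(0-8):   eight data bits and parity
    a1 = (a >> 4) & 0x1FF    # A(9-17):  eight data bits and parity
    ax = a & 0xF             # A(18-21): three extender bits and parity
    return a0, a1, ax


def join_a(a0: int, a1: int, ax: int) -> int:
    """Reassemble register A from its three subregisters."""
    return (a0 << 13) | (a1 << 4) | ax


def data_and_parity(subregister: int):
    """Separate a 9-bit subregister (0-7,P) into its data byte and parity bit."""
    return subregister >> 1, subregister & 1
```

Splitting and joining are inverse operations, so `join_a(*split_a(a))` returns the original contents of A. The same cuts apply to registers C and R, while registers B and D have only the two 9-bit fields.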

There are many data paths among these registers (Fig. 9.6). Registers A, B, C,
or D can be read in from either the R register or the ALU (arithmetic and logical
unit), and the data in each of these registers can be transferred to register Q. The
data in register A can be additionally transferred to register P and bus SAB, and
those in register B to register P. The data in register R can be transferred to register
A, B, C, or D.


9.3.3.2 ALU registers 


There are seven ALU registers. They are described below. 


Comment, description of seven ALU registers (9.6)
Register, P(0-7,P),             $operand register
          Q(0-7,P),             $operand register
          CONTROL(0-3,P),       $ALU function register
          FUNCTION(0-4,P),      $ALU function register
          EXTENSION(0-2,P),     $extension register
          SKEW-SELECT(0-7,P),   $skew-select register
          SKEW-BUFFER(0-3,P),   $skew-buffer register


Registers P and Q serve as the ALU entries for the two operands. Register CONTROL
determines the ALU function to be performed. Register FUNCTION stores the
ALU function for indirect operations. Register EXTENSION is used as an entry
buffer for one of the extended bits. The data in subregisters A0 and AX (or C0 and
CX, or R0 and RX) are transferred to register P or Q. Register SKEW-SELECT is
also an ALU entry from register Q. When the data are transferred from register Q
through register SKEW-BUFFER, a 4-bit left shift can be ordered to occur.


9.3.3.3 Channel registers 


There are two multiplexor channel registers: MPX-DATA and MPX-ERROR.
The MPX-DATA register is the buffer register for outputting a data byte to the
I/O interface. The MPX-ERROR register consists of six single-bit subregisters which
are set by various conditions detected during multiplexor channel operation. They
are described below.


Comment, multiplexor channel registers (9.7)
Register, MPX-DATA(0-7,P),   $interface buffer register
          MPX-ERROR(0-5),    $six error indicators


The selector channel registers will be described in a later chapter. 


9.3.3.4 Storage protect registers 


There are two storage protect registers, SP-DATA and SP-KEY (not shown in
Fig. 9.6). Storage-protect data register SP-DATA holds the storage key whenever
main storage is accessed, while storage-protect key register SP-KEY holds the protect
key for the CPU and the multiplexor channel operations. They are described below.


Comment, storage protect registers (9.8)
Register, SP-DATA(P,0-3),   $storage protect data register
          SP-KEY(P,0-3)     $storage protect key register


9.3.4 Stats 


Stats (IBM terminology) are an aggregate of single-bit registers provided for 
indicating machine status and test conditions. They are divided into two categories: 
Y stats and non-Y stats. There are nine non-Y stats as described below. 

Comment, description of non-Y stats (9.9)
Register, HALT,     $indicate the machine being in stop loop
          WAIT,     $indicate the system being in the wait loop
          ENABLE,   $when on, a machine error causes a logout or hardstop
          ASCII,    $when on, the CPU operates in ASCII mode
          CPU,      $indicate the CPU functioning as channel or not
          PSA,      $indicate addressing of a protected storage location
          ISA,      $indicate addressing of an invalid storage location
          YCD,      $carry register of ALU
          YCI,      $carry register of ALU


In the above, logout (IBM terminology) means the transfer of the contents of the
CPU and the channel-control registers to main storage starting at location 80 hex.

There are 16 Y stats which form four casregisters called YA, YB, YD, and YE.
These stats are described below.


Comment, description of Y stats (9.10)
Register,    Y0,    $storage stat
             Y1,    $mpx stat
             Y2,    $store condition code
             Y3,    $store condition code
             P1,    $parity bit
Casregister, YA(0-4)=Y0-Y1-Y2-Y3-P1,
Register,    Y4,    $general purpose bit
             Y5,    $general purpose bit
             Y6,    $general purpose bit
             Y7,    $general purpose bit
             P2,    $parity bit
Casregister, YB(0-4)=Y4-Y5-Y6-Y7-P2,
Register,    Y8,    $inhibit dump stat (ID)
             Y9,    $maskable interrupt (MI)
             Y10,   $manual stat
             Y11,   $non-existent
Casregister, YD(0-3)=Y8-Y9-Y10-Y11,
Register,    Y12,   $error stat
             Y13,   $non-existent
             Y14,   $integrated zero test (IZT)
             Y15,   $load stat
Casregister, YE(0-3)=Y12-Y13-Y14-Y15,
Casregister, Y(0-3,P1,4-7,P2,8-15)=YA-YB-YD-YE,


In the above, storage stat Y0, when set, allows checking for the ISA and PSA stats.
Mpx stat Y1, when set, allows access to the mpx storage. Stats Y0, Y1, Y2, Y3, and
parity P1 form casregister YA. Stats Y4, Y5, Y6, Y7, and parity P2 form casregister
YB. Maskable interrupt stat Y9 is set when an mpx interrupt occurs. Manual stat
Y10 allows manual loading and display of registers and storage. Error stat Y12,
when set, causes a machine hardstop in case an error occurs. Load stat Y15 is set
when the load button is pressed. Casregisters YD and YE are similarly formed but
with no parity. The 16 Y-stats form casregister Y.


9.3.5 Buses 


There are four buses in the data flow. Buses BUSP and BUSQ serve as the two
input buses to the ALU via the ALU operand registers P and Q, respectively.
Registers A, B, S, T, and W are connected to bus BUSP, and registers A, B, C, and D are
connected to bus BUSQ. Bus BUSALU is the output bus of the ALU. It connects
to registers A, B, S, T, and W. Bus R connects registers A, B, C, D, S, T, W, ROAR,
and the input terminal from the multiplexor channel. It is divided into three parts:
BUSR0, BUSR1, and BUSRX. These buses are described below.


Comment, description of the buses (9.11)
Bus,      BUSP(0-7,P),     $ALU P bus
          BUSQ(0-7,P),     $ALU Q bus
          BUSALU(0-7,P),   $ALU output bus
          BUSR0(0-7,P),    $first byte of bus R
          BUSR1(0-7,P),    $second byte of bus R
          BUSRX(5-7,P),    $extender bits of bus R
Terminal, BUSR=BUSRX-BUSR0-BUSR1,   $bus R


9.3.6 Channels 


There is one multiplexor channel, or MCH, and optionally there are two selector
channels, or SCH's. The MCH has a maximum of 128 subchannels which allow concurrent
operation of 128 relatively low-speed I/O devices such as the card reader and printer.
It can operate either in the multiplex mode or in the burst mode. In the multiplex 
mode, one I/O unit is logically connected to the channel only for the time required 
to transmit one byte of data. For this reason, the multiplex mode is also called the 
byte mode. Between data bytes, the unit is disconnected and the channel is free to 
transmit single bytes to or from other units. In the burst mode, one I/O device remains 
logically connected to the channel for the entire I/O operation. 

When the MCH operates in the byte mode, the CPU time available gradually 
decreases with the increase of the combined data rate of the MCH until no time is 
available when the combined rate reaches 26,000 bytes per second. When the MCH 
operates in the burst mode, all facilities of the CPU are taken over for channel opera- 
tion; the CPU cannot process concurrently with the I/O device. The maximum data 
rate in this mode is 228,000 bytes per second. These data rates are shown in Table 9.8. 


TABLE 9.8 Channel Maximum Data Rates on Model 40
(in bytes per second)


CHANNEL            MULTIPLEXOR MODE    BURST MODE
Multiplexor        26,000              228,000
Selector   One     -                   400,000
           Two     -                   300,000 for either

The SCH's are provided for I/O devices with relatively high data rates. Since
there is only one subchannel, only one I/O control unit can be connected to the SCH
at any one time, and the SCH operates only in the burst mode.

There are two types of selector channels, A and B. The SCH of type A
time-shares the CPU registers and data paths as well as microprogram control to a high
degree. The data are not transferred directly between the main storage and the
interface, but move serially bytewise via a 16-byte buffer in the local storage. When the
16-byte buffer is half full, the SCH requests the use of the CPU registers and the
data paths. The contents of the CPU registers are dumped into the local storage.
Up to 12 read-only-storage (ROS) cycles are required to preserve the current contents
of the CPU registers and to load the control word. The bytes in the buffer are then
transferred to the main storage at two bytes per main-storage cycle.

The dumping of the CPU registers for MCH and SCH works as follows. When
an interrupt from the MCH occurs, all data relevant to the current CPU operation
are dumped to the first-level dump area in the local storage (see Table 9.7). If an SCH
interrupt subsequently occurs, all data relevant to the MCH operation are dumped
to the second-level dump area in the local storage. The SCH is then serviced. When
all SCH operations are complete, the data for the MCH are restored and the MCH
is serviced. When the MCH service is complete, the CPU data are restored and the
CPU resumes its processing.
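The two-level dumping just described behaves like a stack of depth two. The Python sketch below models only the sequence given in the text (an MCH interrupt during CPU processing, followed by an SCH interrupt during MCH service). The class and method names are hypothetical, the register contents are represented by arbitrary Python values, and any other order of interrupts is rejected because the text does not describe it.

```python
class DumpAreas:
    """Model of the first-level and second-level dump areas of the local storage."""

    def __init__(self, cpu_data):
        self.active = ("CPU", cpu_data)   # (current user of the registers, its data)
        self.first_level = None           # first-level dump area
        self.second_level = None          # second-level dump area

    def mch_interrupt(self, mch_data):
        if self.active[0] != "CPU":
            raise RuntimeError("only an MCH interrupt of the CPU is modeled")
        self.first_level = self.active    # dump the CPU data
        self.active = ("MCH", mch_data)

    def sch_interrupt(self, sch_data):
        if self.active[0] != "MCH":
            raise RuntimeError("only an SCH interrupt of the MCH is modeled")
        self.second_level = self.active   # dump the MCH data
        self.active = ("SCH", sch_data)

    def service_complete(self):
        if self.active[0] == "SCH":       # restore the MCH data
            self.active, self.second_level = self.second_level, None
        elif self.active[0] == "MCH":     # restore the CPU data
            self.active, self.first_level = self.first_level, None
        else:
            raise RuntimeError("no channel service is in progress")
```

Calling `mch_interrupt`, `sch_interrupt`, and then `service_complete` twice returns the model to the CPU with both dump areas empty, which mirrors the restore order given above.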

The SCH of type B has its own circuitry and thus does not time-share the CPU;
this allows the CPU to process concurrently. It does not use the local storage as a
buffer in order to eliminate the access to the local storage. Data are transferred to
the main storage as soon as two bytes are accumulated. When there are two SCH's
operating concurrently, the maximum data rate for each SCH is 300,000 bytes per
second. When there is only one SCH, the maximum data rate for the SCH is 400,000
bytes per second. These data rates are also shown in Table 9.8. Further discussion
on the SCH will be limited to type B.


9.3.7 I/O Interface


The I/O control units are connected to the channel via an I/O interface. This 
interface is standardized so that any I/O control unit can be connected to any channel. 
In this way, practically all of the IBM I/O devices become available to model 40. 

The standard interface is physically a data path made up of a set of wires. It 
can transmit one byte of information at a time between the channel and the I/O 
control unit. Electrical specifications (signal levels, line drivers, and terminators) 
for all I/O units connected to the interface are identical. Communication with any 
type of I/O device makes use of the same interface language. Therefore, the I/O 
control units have to interpret this language into the actual control signals required 
by their attached I/O devices and vice versa, and the channels have to translate the 
interface language into the control signals of the particular CPU, and vice versa. 


9.3.8 I/O Control Units and Devices


I/O control units operate the I/O devices and translate the device language into 
channel language, and vice versa. Three commonly-used I/O control units for model 
40 are: integrated control unit, magnetic tape control unit, and disk file control unit. 
The integrated control unit controls card reader, card punch, and printer for concur- 
rent operations. It translates the internal data code into card code, and vice versa. 
It provides data buffers for a complete card or print line. The magnetic tape control 
unit controls up to eight magnetic tape drives. The file control unit controls up to 
eight disk drives. It translates the data in serial bits into the data in bytes, and vice 
versa. 

An I/O device is one which transports a storage medium such as card or magnetic
tape and performs a reading operation from the medium or performs a writing
operation to the medium. Examples of the commonly-used I/O devices are the console
typewriter, the card reader/punch, the magnetic tape storage, drum storage, and
disk storage with either permanent disks or interchangeable disk packs.

An I/O control unit need not be a separate unit from an I/O device. The control
unit for the IBM 1443 printer is a part of the printer, the control unit for the console
typewriter is a part of the console, and a magnetic tape control unit may contain one
or more magnetic tape drives.


9.3.9 Channel-to-channel Adapter 


As shown in Fig. 9.7, an IBM channel-to-channel adapter connects the I/O
interfaces of two channels, normally of two different computer systems, in such a
way that each channel appears as a control unit to the other. Since the I/O interface
of the IBM System/360 family of computers is standardized, the channel-to-channel
adapter may connect one model of the family to any other model.


Fig. 9.7 Channel to channel connection

The adapter operates in the burst mode. It transmits data one byte in width
at a rate set by the two channels. If the two channels are in one computer system,
the adapter permits the moving of blocks of data from one area in the main storage
to another. If they are of two computer systems, it interconnects the two CPU's of the
two systems and thus gives a multiprocessing system.


9.4 Processing Unit 


The processing unit is the place where instructions are executed. Instead of 
describing each instruction, this section discusses the instruction set, the general 
purpose registers, the processing modes, the processing operations, and the code 
translation. 


9.4.1 Instruction Set 


The instruction set is listed in Tables 9.9-9.12. These instructions may
be divided into four types: data handling instructions, branch instructions, I/O
instructions, and system control instructions. Data handling instructions perform
arithmetic and logical operations. These include fixed-point, floating-point, and
decimal arithmetic instructions as well as logical instructions. Branch instructions
change the order in which program instructions are executed. These include branch
on condition instructions, branch and link instructions, branch on count
instructions, and the execute instruction. There are four I/O instructions:
START I/O, HALT I/O, TEST I/O, and TEST CHANNEL. System control instructions
control the overall system status. They are: load PSW, set program mask, set
system mask, supervisor call, set storage key, insert storage key, write direct,
read direct, and diagnose.


TABLE 9.9 Operation Code, RR Format

        BRANCHING AND         FIXED-POINT           FLOATING-POINT     FLOATING-POINT
        STATUS SWITCHING      FULLWORD              LONG               SHORT
                              AND LOGICAL
XXXX    0000xxxx              0001xxxx              0010xxxx           0011xxxx

0000                          Load positive         Load positive      Load positive
0001                          Load negative         Load negative      Load negative
0010                          Load and test         Load and test      Load and test
0011                          Load complement       Load complement    Load complement
0100    Set program mask      AND                   Halve              Halve
0101    Branch and link       Compare logical
0110    Branch on count       OR
0111    Branch/condition      Exclusive OR
1000    Set key               Load                  Load               Load
1001    Insert key            Compare               Compare            Compare
1010    Supervisor call       Add                   Add N              Add N
1011                          Subtract              Subtract N         Subtract N
1100                          Multiply              Multiply           Multiply
1101                          Divide                Divide             Divide
1110                          Add logical           Add U              Add U
1111                          Subtract logical      Subtract U         Subtract U


TABLE 9.10 Operation Code, RX Format

        FIXED-POINT           FIXED-POINT           FLOATING-POINT     FLOATING-POINT
        HALFWORD              FULLWORD              LONG               SHORT
        AND BRANCHING         AND LOGICAL
XXXX    0100xxxx              0101xxxx              0110xxxx           0111xxxx

0000    Store                 Store                 Store              Store
0001    Load address
0010    Store character
0011    Insert character
0100    Execute               AND
0101    Branch and link       Compare logical
0110    Branch on count       OR
0111    Branch/condition      Exclusive OR
1000    Load                  Load                  Load               Load
1001    Compare               Compare               Compare            Compare
1010    Add                   Add                   Add N              Add N
1011    Subtract              Subtract              Subtract N         Subtract N
1100    Multiply              Multiply              Multiply           Multiply
1101                          Divide                Divide             Divide
1110    Convert-decimal       Add logical           Add U              Add U
1111    Convert-binary        Subtract logical      Subtract U         Subtract U

Note: N = normalized      U = unnormalized
      SL = single logical S = single
      DL = double logical D = double


TABLE 9.11 Operation Code, RS and SI Formats

        BRANCHING             FIXED-POINT
        STATUS SWITCHING      LOGICAL AND
        AND SHIFTING          INPUT/OUTPUT
XXXX    1000xxxx              1001xxxx              1010xxxx           1011xxxx

0000    Set system mask       Store multiple
0001                          Test under mask
0010    Load PSW              Move
0011    Diagnose
0100    Write direct          AND
0101    Read direct           Compare logical
0110    Branch/high           OR
0111    Branch/low-equal      Exclusive OR
1000    Shift right SL        Load multiple
1001    Shift left SL
1010    Shift right S
1011    Shift left S
1100    Shift right DL        Start I/O
1101    Shift left DL         Test I/O
1110    Shift right D         Halt I/O
1111    Shift left D          Test channel


TABLE 9.12 Operation Code, SS Format

                              LOGICAL                                  DECIMAL
XXXX    1100xxxx              1101xxxx              1110xxxx           1111xxxx

0000
0001                          Move numeric                             Move with offset
0010                          Move                                     Pack
0011                          Move zone                                Unpack
0100                          AND
0101                          Compare logical
0110                          OR
0111                          Exclusive OR
1000                                                                   Zero and add
1001                                                                   Compare
1010                                                                   Add
1011                                                                   Subtract
1100                          Translate                                Multiply
1101                          Translate and test                       Divide
1110                          Edit
1111                          Edit and mark

Notes: N = normalized      U = unnormalized
       SL = single logical S = single
       DL = double logical D = double
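
The column headings of Tables 9.9-9.12 suggest that the two high-order bits of the
operation code select the instruction format. The following is a minimal sketch of
that decoding in Python; the function names are illustrative assumptions and are not
part of the machine description.

```python
# Sketch: classify an 8-bit operation code by its two high-order bits,
# following the column headings of Tables 9.9-9.12.
FORMATS = {0b00: "RR", 0b01: "RX", 0b10: "RS/SI", 0b11: "SS"}


def instruction_format(opcode: int) -> str:
    """Return the format implied by the two leftmost bits of the operation code."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("operation code must fit in 8 bits")
    return FORMATS[opcode >> 6]


def table_position(opcode: int) -> tuple:
    """Return the (column, row) labels used in the tables."""
    column = f"{opcode >> 4:04b}xxxx"   # high-order four bits select the column
    row = f"{opcode & 0xF:04b}"         # low-order four bits select the row
    return (column, row)
```

For example, the operation code 00011010 falls in column 0001xxxx, row 1010 of
Table 9.9 (fixed-point fullword Add) and is therefore an RR instruction.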


9.4.2 General-purpose Registers 


The use of general-purpose registers for temporary addresses or data can increase
the processing speed because it reduces the number of accesses to the main storage.
The processing unit has one set of sixteen 32-bit general-purpose registers called
general registers and another set of four optional 64-bit floating-point registers.
The floating-point registers temporarily hold the floating-point numbers for multiple
floating-point arithmetic operations. The general registers can be used as index
registers, relocation registers, and accumulators for fixed-point arithmetic or
logical operations. These registers are addressed by a 4-bit address. If these
registers are to be used for another purpose during the processing of a program,
their contents can be stored in the main memory, and these registers can then be
loaded with some other values. A single instruction is available for loading or
storing the set of registers. As mentioned previously, a 4-bit R field in the
instruction addresses the general registers. This field may also address the
floating-point registers; which set is addressed is specified by the operation code.

The general and floating-point registers for model 30 are a part of the main
memory. For models 40 and 50, they are held in a separate, faster core memory. For
models 65, 67, and 75, they are transistor registers for still faster operation.


9.4.3 Processing Modes 


There are three processing modes: register-to-register, storage-to-register, and 
storage-to-storage. In the storage-to-storage mode for model 40, the operand is brought 
out of the main storage two bytes at a time and operated on one byte at a time; the
result is stored into the main storage two bytes at a time. The operand is of variable 
length, which can be one to 16 bytes for a decimal number or one to 256 bytes for a 
character string. 

In storage-to-register mode, one of the two operands is in the main storage and 
the other is in the register; they are binary operands. These operands are transferred 
to the inputs of the arithmetic and logical network to be operated upon, and the result 
replaces the operand in the register. In the register-to-register mode, both operands 
are also binary and in the registers. This mode makes the main storage available to
serve another system unit. The operands for these two modes are of fixed length.
They can be halfwords, fullwords, or double words, but they must observe address 
boundary alignment. 
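
The alignment rule can be stated as a simple divisibility test. The sketch below
assumes the usual operand sizes of 2, 4, and 8 bytes for halfwords, fullwords, and
double words; the names used are illustrative only.

```python
# Sketch: a fixed-length operand is aligned when its byte address is a
# multiple of its length in bytes (assumed lengths: 2, 4, and 8 bytes).
OPERAND_BYTES = {"halfword": 2, "fullword": 4, "doubleword": 8}


def is_aligned(address: int, kind: str) -> bool:
    """Return True if the byte address lies on the boundary required by kind."""
    return address % OPERAND_BYTES[kind] == 0
```

Under these assumptions, address 1002 is a valid halfword address but not a valid
fullword address.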


9.4.4 Processing Operations 


There are four classes of processing operations: fixed-point arithmetic,
floating-point arithmetic, decimal arithmetic, and logical operations.

For fixed-point arithmetic operations, the operands are 32-bit fixed-point binary
numbers. For better speed or storage utilization, halfword operands may be specified
in most operations. Some products and all dividends are 64 bits long, using an
even-odd register pair. Since the 32-bit word length can accommodate the 24-bit
address, the entire fixed-point instruction set can be used in address computation.

As mentioned previously, the floating-point number may occur in either a short 
format or a long one. The short format gives a precision of about seven decimal 
digits, while the long format gives a precision of about 17 decimal digits. Both fixed-point
and floating-point operations are performed with one operand from a register, and 
another from either a register or the main storage. The result, which is placed in the 
register, is generally the same length as the operands. 
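
The quoted precisions can be checked with a short calculation. It assumes, as in the
System/360 floating-point formats, that the short format carries a 24-bit fraction
and the long format a 56-bit fraction; these widths are not restated in this section.

```latex
\begin{align*}
\text{short format: } & 24 \log_{10} 2 \approx 24 \times 0.30103 \approx 7.2 \text{ decimal digits},\\
\text{long format: }  & 56 \log_{10} 2 \approx 56 \times 0.30103 \approx 16.9 \text{ decimal digits}.
\end{align*}
```

These values agree with the figures of about seven and about 17 decimal digits.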

Decimal arithmetic includes addition, subtraction, multiplication, division, and 
comparison. Decimal operations are performed with both the operands and the 
result in the main storage. Decimal numbers for decimal operations are in the packed 
decimal format. The packing of digits enables efficient use of storage, increases 
arithmetic performance, and improves the rate of data transfer. 

Operations for comparison, translation, editing, bit testing, and bit setting are 
provided for processing logical operands of fixed length or variable length. Logical 
operations on operands of variable length are processed from left to right, one byte 
at a time. Two scanning instructions permit byte-to-byte translation and testing by 
means of tables. An important class of logical operations for variable-length operands 
tests, sets, resets, and complements individual bits of a one-byte field by means of an 
8-bit mask in the instruction. 
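
The four mask operations on a one-byte field can be sketched as follows. The function
names are illustrative assumptions, and in this sketch a 1 bit in the mask selects
the bit position to be tested or changed.

```python
# Sketch: testing, setting, resetting, and complementing individual bits of
# a one-byte field by means of an 8-bit mask.

def bits_under_mask(byte: int, mask: int) -> str:
    """Report whether the selected bits are all zero, all one, or mixed."""
    selected = byte & mask
    if selected == 0:
        return "zero"
    if selected == mask:
        return "ones"
    return "mixed"


def set_bits(byte: int, mask: int) -> int:
    """Set the selected bits to 1 (logical OR)."""
    return (byte | mask) & 0xFF


def reset_bits(byte: int, mask: int) -> int:
    """Reset the selected bits to 0 (logical AND with the inverted mask)."""
    return byte & ~mask & 0xFF


def complement_bits(byte: int, mask: int) -> int:
    """Invert the selected bits (logical exclusive OR)."""
    return (byte ^ mask) & 0xFF
```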


9.4.5 Code Translation 


When numerical data coded in the EBCDIC code are read in from a
character-set-sensitive input device, they are translated into the packed decimal
format by the PACK instruction. Binary arithmetic is used if the numerical data are
of fixed length, while decimal arithmetic is used if they are of variable length. If
binary arithmetic is used, the convert-to-binary instruction translates the data
into the fixed-point format, and the data are then processed by fixed-point
arithmetic operations. After processing, the convert-to-decimal instruction and
either an unpack or an edit instruction are used to translate the data into the
EBCDIC code for an output device such as a printer or punch (Fig. 9.8). If decimal
arithmetic is used, the convert instructions can be omitted and the data are
processed by the decimal arithmetic operations; this is illustrated in Fig. 9.9.


Fig. 9.8 Fixed-point arithmetic processing sequence on EBCDIC input (flowchart:
EBCDIC input; convert to packed decimal format; convert to binary; process with
binary; binary output; convert to packed decimal format; packed decimal output;
EBCDIC output)


If the input is in a different code (up to eight bits) such as teletype code,
the code translation may be performed by using a translation table in the memory.
A translate instruction permits the conversion of a string of up to 256 characters.


Fig. 9.9 Processing sequence using the decimal instruction set on EBCDIC input
(flowchart: EBCDIC input; convert to packed decimal format; process with decimal
instruction set; packed decimal output; EBCDIC output)
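
The translation steps of Fig. 9.8 can be sketched in Python as follows. This is an
illustration under stated assumptions rather than a description of the hardware: the
function names are hypothetical, digits are assumed to be coded as hexadecimal F0 to
F9 in EBCDIC, and the sign is assumed to occupy the zone of the last input byte and
the rightmost half-byte of the packed field (C for plus, D for minus).

```python
# Sketch of the Fig. 9.8 sequence: EBCDIC -> packed decimal -> binary -> packed -> EBCDIC.

def _nibbles(data: bytes) -> list:
    """Split each byte into its two 4-bit halves, left half first."""
    return [n for b in data for n in (b >> 4, b & 0x0F)]


def _from_nibbles(nibbles: list) -> bytes:
    """Join an even-length list of 4-bit values into bytes."""
    return bytes((nibbles[i] << 4) | nibbles[i + 1] for i in range(0, len(nibbles), 2))


def pack(zoned: bytes) -> bytes:
    """Keep the digit half of every byte and move the sign to the right end."""
    nibbles = [b & 0x0F for b in zoned] + [zoned[-1] >> 4]
    if len(nibbles) % 2:            # pad on the left to a whole number of bytes
        nibbles.insert(0, 0)
    return _from_nibbles(nibbles)


def convert_to_binary(packed: bytes) -> int:
    """Interpret a packed decimal field as a signed integer."""
    *digits, sign = _nibbles(packed)
    value = 0
    for d in digits:
        value = value * 10 + d
    return -value if sign in (0xB, 0xD) else value


def convert_to_decimal(value: int, length: int = 8) -> bytes:
    """Produce a packed decimal field of the given length in bytes."""
    digits = [int(c) for c in str(abs(value))]
    if len(digits) > 2 * length - 1:
        raise OverflowError("value does not fit in the packed field")
    sign = 0xD if value < 0 else 0xC
    padding = [0] * (2 * length - 1 - len(digits))
    return _from_nibbles(padding + digits + [sign])


def unpack(packed: bytes) -> bytes:
    """Give every digit a zone of F; the last byte carries the sign as its zone."""
    *digits, sign = _nibbles(packed)
    zoned = [0xF0 | d for d in digits]
    zoned[-1] = (sign << 4) | digits[-1]
    return bytes(zoned)
```

For the decimal sequence of Fig. 9.9, only pack and unpack would be needed, since the
arithmetic is then carried out directly on the packed fields.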


9.5 Main Control Unit 


The main control unit performs the functions of sequencing the instructions in a
program, switching from one program to another, and handling interrupts
when they occur. This section describes how these functions are carried out.


9.5.1 Instruction Sequencing


The main control unit normally sequences the instructions of a program by
taking the next instruction from the main storage location addressed by the
instruction counter. After an instruction is fetched, the instruction counter is
increased by the number of bytes in the instruction; the counter now contains the
next instruction address. This normal sequencing can be changed by branch
instructions or by an interrupt. Sequencing change by interrupt is described later
in this section.

Sequencing change by branch instructions is most often accomplished by a
branch on condition instruction, which inspects a 2-bit condition register Y2-Y3.
Many of the arithmetic, logical, and I/O operations indicate an outcome by setting
the condition register to one of its four possible states. Subsequently, a conditional
branch can select one of the states as a criterion for branching. For example, the
condition code indicates such conditions as overflow, equal, nonzero, or channel
busy.

The outcome of address arithmetic and counting operations can be tested by a 
conditional branch to control a loop. Two instructions, branch on count and branch 
on index, provide for one-instruction execution of the most common arithmetic-test 
combinations. Subroutine linkage is provided by the branch and link instruction. 
The execute instruction permits execution of a single instruction that is not in the 
current instruction sequence. 
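
The selection of condition-register states by a conditional branch can be sketched as
follows. The 4-bit mask, in which each bit from left to right selects one of the
states 00, 01, 10, and 11, is an assumption based on the System/360 convention and is
not described in this section; the function names are illustrative.

```python
# Sketch: normal sequencing and branch on condition.

def branch_taken(mask: int, condition_code: int) -> bool:
    """Return True if the 4-bit mask selects the current 2-bit condition code."""
    return bool(mask & (0b1000 >> condition_code))


def next_instruction_address(counter: int, length: int, mask: int,
                             condition_code: int, branch_address: int) -> int:
    """Advance the counter by the instruction length unless the branch succeeds."""
    if branch_taken(mask, condition_code):
        return branch_address
    return counter + length
```

With a mask of 1000, for instance, the branch is taken only when the condition
register is in state 00; with a mask of 1111 it is always taken.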


9.5.2 CPU Status 


The CPU status is designated by four sets of states: the stop or operate state; 
the run or wait state; the masked or interruptible state; and the supervisor or problem 
state. 

The operate state indicates that the CPU is capable of executing instructions,
the interrupts are accepted, and the timer is updated. If the CPU is in the stop state,
instructions are not executed, interrupts are not accepted, and the timer is not
updated. These states are controlled manually.

The run state indicates that instructions are being executed by the CPU in the 
normal manner. If the CPU is in the wait state, no instruction is executed, but the
timer is updated and the I/O and external interrupts are accepted unless masked.
The wait state is normally entered by the program to wait for an interruption such as
an I/O interrupt or console operator intervention.

When the CPU is in the interruptible state, it allows interrupts. Most of the 
interrupts can be masked out. 


TABLE 9.13 Privileged Instructions

GROUP                  PRIVILEGED INSTRUCTION

PSW operation          Load PSW
                       Set system mask
I/O operation          Start I/O
                       Halt I/O
                       Test I/O
                       Test channel
Storage protection     Insert storage key
                       Set storage key
Direct control         Read direct
                       Write direct
Diagnosis              Diagnose


When the CPU is in the supervisor state, all instructions are valid and can be
executed. If the CPU is in the problem state, a group of instructions called the
privileged instructions are invalid and the execution of any of these instructions
causes a program check interrupt. The set of privileged instructions is divided into
five groups and shown in Table 9.13.


9.5.3 Program Status Word 


Program status refers to the overall system status of a problem program. All
pertinent information to indicate and control the program status is stored in a
double word called the program status word, or PSW. The PSW format is shown in Fig.
9.10. The basic idea of using the PSW is to have the hardware keep track of the
machine status so that the CPU can readily perform program switching.


| System | Protection |      | Interrupt | Instruction | Condition | Program | Instruction |
| mask   | key        | AMWP | code      | length code | code      | mask    | address     |
  0    7   8       11  12  15  16      31  32        33  34      35  36    39  40        63

0-7      System mask
  0        Multiplexor channel mask
  1        Selector channel 1 mask
  2        Selector channel 2 mask
  3        Selector channel 3 mask
  4        Selector channel 4 mask
  5        Selector channel 5 mask
  6        Selector channel 6 mask
  7        External/timer mask
8-11     Protection key
12       ASCII mode (A)
13       Machine check mask (M)
14       Wait state (W)
15       Problem state (P)
16-31    Interrupt code
32-33    Instruction length code (ILC)
34-35    Condition code (CC)
36-39    Program mask
  36       Fixed-point overflow mask
  37       Decimal overflow mask
  38       Exponent underflow mask
  39       Significance mask
40-63    Instruction address


Fig. 9.10 Program status word format


There are eight fields in the PSW: system mask (0-7), protection key (8-11),
AMWP (12-15), interrupt code (16-31), instruction length code (32-33), condition
code (34-35), program mask (36-39), and next instruction address (40-63). The
designations of these fields are now described.


1. System Mask. A 0 in bit 0 masks out the interrupts from the multiplexor channel
(i.e., prevents them from happening). A 0 in bits 1 to 6 masks out the interrupts
from the selector channels 1 to 6, respectively. A 0 in bit 7 masks out the
external/timer interrupt.

2. Protection Key. This is a 4-bit storage key assigned to the program which the PSW
represents.

3. AMWP. A, M, W, and P represent bits 12 to 15, respectively. A 1 in bit A indicates
that the ASCII-8 code is being used; otherwise, the EBCDIC code is being used. A 0
in bit M masks out the machine check interrupt. A 1 in bit W indicates that the CPU
is in the wait state; otherwise, it is in the run state. A 1 in bit P indicates that the
CPU is in the problem state; otherwise, it is in the supervisor state.

4. Interrupt Code. The interrupt code identifies the source of the interrupt.
Designation of the code varies for each of the five classes of interrupts. These
interpretations, as well as the mask bits of the PSW when applicable, are shown in
Table 9.14.

5. Instruction Length Code. The instruction length code identifies the length of
the last instruction executed; this information is required for some interrupts
to locate the instruction that was being interpreted when the interrupt occurred.

TABLE 9.14 Designation of the PSW Interrupt Code

                               INTERRUPT CODE          MASK
INTERRUPT SOURCE               (PSW BITS 16-31)        (PSW BIT)

I/O Interrupt
  Multiplexor channel          00000000 aaaaaaaa       0
  Selector channel 1           00000001 aaaaaaaa       1
  Selector channel 2           00000010 aaaaaaaa       2
  Selector channel 3           00000011 aaaaaaaa       3
  Selector channel 4           00000100 aaaaaaaa       4
  Selector channel 5           00000101 aaaaaaaa       5
  Selector channel 6           00000110 aaaaaaaa       6
Program Interrupt
  Operation                    00000000 00000001
  Privileged operation         00000000 00000010
  Execute                      00000000 00000011
  Protection                   00000000 00000100
  Addressing                   00000000 00000101
  Specification                00000000 00000110
  Data                         00000000 00000111
  Fixed-point overflow         00000000 00001000       36
  Fixed-point divide           00000000 00001001
  Decimal overflow             00000000 00001010       37
  Decimal divide               00000000 00001011
  Exponent overflow            00000000 00001100
  Exponent underflow           00000000 00001101       38
  Significance                 00000000 00001110       39
  Floating-point divide        00000000 00001111
Supervisor Call Interrupt
  Instruction bits             00000000 rrrrrrrr
External Interrupt
  External signal 7            00000000 xxxxxxx1       7
  External signal 6            00000000 xxxxxx1x       7
  External signal 5            00000000 xxxxx1xx       7
  External signal 4            00000000 xxxx1xxx       7
  External signal 3            00000000 xxx1xxxx       7
  External signal 2            00000000 xx1xxxxx       7
  Interrupt key                00000000 x1xxxxxx       7
  Timer                        00000000 1xxxxxxx       7
Machine Check Interrupt
  Machine malfunction          cccccccc cccccccc       13

Note: aaaaaaaa denotes the device address of the I/O unit that caused the interrupt
      rrrrrrrr denotes bits 8-15 of the SVC instruction
      cccccccc denotes a model-dependent code


6. Condition Code. The condition code is set by certain arithmetic, compare, logical,
and I/O instructions. Designation of the code is different for each individual
instruction. In order to give some idea of the code, the designations for each class
of operations are shown in Table 9.15.


TABLE 9.15 Designation of the PSW Condition Code

                                          CONDITION CODE
INSTRUCTIONS               00            01                10             11

Arithmetic operations      0             <0                >0             Overflow
Compare operations         Equal         Low               High           -
Logical operations†        0             ≠0                -              -
I/O operations‡            Available     Channel status    Busy or        Not
                                         available         terminated     operational

†The instructions are for logical AND, OR, XOR operations.
‡The condition code refers to the addressed unit.


7. Program Mask. The program mask allows the program to mask out the fixed-point 
overflow interrupt (bit 36), decimal overflow interrupt (bit 37), exponent underflow 
interrupt (bit 38), and significance interrupt (bit 39). 

8. Instruction Address. This is the next instruction address. When this address is first
placed in the PSW, it is the first instruction address. It is updated during the fetching
of each instruction. It is replaced by a branch address after the execution of a
successful branch instruction.
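
The field boundaries listed above can be used to take a PSW apart. The sketch below
treats the PSW as a 64-bit integer whose bit 0 is the leftmost bit; the dictionary
keys and function names are illustrative assumptions.

```python
# Sketch: extracting the eight PSW fields from a 64-bit value (bit 0 = leftmost bit).
PSW_FIELDS = {
    "system_mask": (0, 7),
    "protection_key": (8, 11),
    "amwp": (12, 15),
    "interrupt_code": (16, 31),
    "instruction_length_code": (32, 33),
    "condition_code": (34, 35),
    "program_mask": (36, 39),
    "instruction_address": (40, 63),
}


def field(psw: int, first: int, last: int) -> int:
    """Return bits first..last of the PSW as an unsigned integer."""
    width = last - first + 1
    return (psw >> (63 - last)) & ((1 << width) - 1)


def decode_psw(psw: int) -> dict:
    """Return all eight fields by name."""
    return {name: field(psw, *bits) for name, bits in PSW_FIELDS.items()}


def in_wait_state(psw: int) -> bool:
    """Bit 14 (W) is 1 in the wait state."""
    return field(psw, 14, 14) == 1


def in_problem_state(psw: int) -> bool:
    """Bit 15 (P) is 1 in the problem state."""
    return field(psw, 15, 15) == 1
```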


Each of the five classes of interrupts has two locations for storing two program
status words, the old PSW and the new PSW. These are the permanent locations in
the main storage at addresses 24, 32, 40, 48, 56, 88, 96, 104, 112, and 120 as shown
in Table 9.16.


TABLE 9.16 Permanent Storage Assignment

ADDRESS    BYTE LENGTH    PURPOSE

  0        8              Initial program loading PSW
  8        8              Initial program loading CCW 1
 16        8              Initial program loading CCW 2
 24        8              External old PSW
 32        8              Supervisor call old PSW
 40        8              Program old PSW
 48        8              Machine check old PSW
 56        8              Input/output old PSW
 64        8              Channel status word
 72        4              Channel address word
 76        4              Unused
 80        4              Timer
 84        4              Unused
 88        8              External new PSW
 96        8              Supervisor call new PSW
104        8              Program new PSW
112        8              Machine check new PSW
120        8              Input/output new PSW
128                       Diagnostic scan-out area†

†The size of the diagnostic scan-out area is configuration dependent.


9.5.4 Interrupt 


An interrupt is a hardware mechanism which, when it occurs, switches the CPU
from one program to another. In this section, we describe the types, the maskability,
and the priority of interrupts as well as the mechanism of program switching.

There are five types of interrupts: (a) machine error, (b) external, (c) I/O, (d)
program error, and (e) supervisor call. A machine error (or machine check) interrupt
occurs when a hardware malfunction happens. This interrupt initiates a routine
which locates the fault and performs a scan-out (i.e., automatically records the
status of the hardware in the memory) of the status of the CPU into the diagnostic
scan-out area at location 128 (see Table 9.16). The external interrupt allows the CPU
to respond to external signals from such external devices as the interrupt key on the
system control panel and the hardware timer. The I/O interrupt allows the CPU to
respond to special conditions occurring in the channels or in I/O units. The program
error (or program check) interrupt occurs when an unusual condition happens in the
problem program such as overflow, improper divide, lost significance, illegal op-code,
privileged instruction, or violation of memory protection. The supervisor-call
interrupt takes place when the supervisor-call instruction in a problem program is
executed; this results in a switching from the problem state to the supervisor state.


The first type of interrupt is maskable by the PSW bit 13. The second type is
maskable by the PSW bit 7. The third type is maskable by the PSW system mask
(bits 0-6). The fourth type has a subset which is maskable by the PSW program mask
(bits 36-39). The fifth type does not need to be maskable and is not maskable.


Fig. 9.11 Interrupt action (flowchart: interrupt signal; if interruptible, CPU
completes, terminates, or suppresses current instruction; interruption begins; set
instruction-length code and interrupt code; store old PSW; fetch new PSW; interrupt
supervisor handles interrupt; analyze interrupt cause and take appropriate action;
return to supervisor and load old PSW; return to problem program)


If interrupts of more than one type occur at the same time during the execution
of an instruction, only one of them can be accepted at a time.
Simultaneous interrupts are accepted in the following order of priority:


1. Machine check interrupt 

2. Program check or supervisor-call interrupt 
3. External interrupt 

4. I/O interrupt 


Note that the program-check and supervisor-call interrupts share one priority level
because they are mutually exclusive; they cannot occur at the same time.
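
The priority rule can be sketched as a sort over the pending interrupt types. The
names below are illustrative assumptions; the priority levels are those of the list
above.

```python
# Sketch: ordering simultaneous interrupts by the priority levels listed above.
PRIORITY_LEVEL = {
    "machine check": 1,
    "program check": 2,
    "supervisor call": 2,
    "external": 3,
    "I/O": 4,
}


def acceptance_order(pending: list) -> list:
    """Return the pending interrupt types in the order in which they are accepted."""
    if "program check" in pending and "supervisor call" in pending:
        # The two are mutually exclusive, so this combination should never arise.
        raise ValueError("program check and supervisor call cannot occur together")
    return sorted(pending, key=PRIORITY_LEVEL.__getitem__)
```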

The CPU action after an interrupt occurs is shown in the flowchart in Fig. 9.11.
As shown in the figure, when an interrupt occurs, the CPU determines whether it
is interruptible for that class of interrupts. If it is, the current instruction is
completed, terminated, or suppressed, and the interrupt action begins. The
CPU sets the instruction length code and the interrupt code in the current PSW
(which controls the current problem program). It then stores the current PSW into
the old PSW location and fetches the PSW from the new PSW location to become
the current PSW. For example, the old and new PSW's for the program interrupt are
located at main memory locations 40 and 104, respectively, as shown in Table 9.16.
This exchange of the PSW's is referred to as PSW switching and is shown in Fig. 9.12.
The new PSW locates the interrupt routine for the particular class of interrupts. The
interrupt routine, which resides permanently in the main memory, now proceeds to
analyze the interrupt cause from the PSW in the old PSW location and then takes
the appropriate action. After the interrupt has been handled, the last instruction of
the interrupt routine is a load PSW instruction, which recalls the PSW in the old PSW
location and makes it the current PSW so that the original problem program can now
be resumed. In the case of multiprogramming, the load PSW instruction may call a
different PSW.


Fig. 9.12 PSW switching during interrupt (diagram: the old PSW and new PSW
locations in main storage and the current PSW)
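
PSW switching can be sketched with a small model in which main storage is a table of
double words indexed by address. The class and method names are illustrative
assumptions, the addresses are those of Table 9.16, and the setting of the instruction
length code is omitted for brevity.

```python
# Sketch: PSW switching on an interrupt and return by a load PSW operation.
OLD_PSW = {"external": 24, "supervisor call": 32, "program": 40,
           "machine check": 48, "input/output": 56}
NEW_PSW = {"external": 88, "supervisor call": 96, "program": 104,
           "machine check": 112, "input/output": 120}


class CPU:
    def __init__(self):
        self.storage = {}        # address -> 64-bit double word
        self.current_psw = 0

    def interrupt(self, kind: str, interrupt_code: int) -> None:
        """Store the current PSW as the old PSW and fetch the new PSW."""
        # Place the interrupt code in bits 16-31 of the current PSW.
        psw = self.current_psw & ~(0xFFFF << 32)
        psw |= (interrupt_code & 0xFFFF) << 32
        self.storage[OLD_PSW[kind]] = psw
        self.current_psw = self.storage[NEW_PSW[kind]]

    def load_psw(self, address: int) -> None:
        """Make the double word at the given address the current PSW."""
        self.current_psw = self.storage[address]
```

A program check with interrupt code 00001000 (fixed-point overflow), for example,
stores the current PSW at location 40 and continues with the PSW found at location
104; a later load_psw(40) resumes the interrupted program.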


During the execution of an interrupt routine, another interrupt may occur. If this
second interrupt is of the same class, it would cause the current PSW for the first
interrupt routine to be stored in the same old PSW location. Thus, the old PSW for
the original problem program would be lost, and there would be no way to return to
the problem program. Therefore, interruption during the execution of an interrupt
routine by another interrupt of the same class must be prevented. This prevention
is done by masking the proper bits in the new PSW. By means of masking, the
interruption can be held pending for later acceptance. As mentioned, the
PSW provides the system mask (bits 0-7), the machine check mask (bit 13), and the
program mask (bits 36-39). When the mask bits for particular types of interrupts are
made 0, these types are masked, i.e., prevented from occurring. When the mask bits
are set to 1, on the other hand, the CPU is interruptible for the corresponding types
of interrupts.
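
The masking rule can be sketched as a lookup of the mask bit for each interrupt
source, using the bit assignments of Table 9.14. The names are illustrative
assumptions; a source without a mask bit is treated as always interruptible.

```python
# Sketch: deciding whether the CPU is interruptible for a given interrupt source.
MASK_BIT = {
    "multiplexor channel": 0,
    "selector channel 1": 1, "selector channel 2": 2, "selector channel 3": 3,
    "selector channel 4": 4, "selector channel 5": 5, "selector channel 6": 6,
    "external": 7,
    "machine check": 13,
    "fixed-point overflow": 36,
    "decimal overflow": 37,
    "exponent underflow": 38,
    "significance": 39,
}


def psw_bit(psw: int, n: int) -> int:
    """Return bit n of a 64-bit PSW, where bit 0 is the leftmost bit."""
    return (psw >> (63 - n)) & 1


def is_interruptible(psw: int, source: str) -> bool:
    """A mask bit of 1 permits the interrupt; a mask bit of 0 masks it out."""
    n = MASK_BIT.get(source)
    if n is None:                 # e.g., supervisor call: not maskable
        return True
    return psw_bit(psw, n) == 1
```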


9.5.5 Microprogram 


As previously mentioned, there is a read-only storage (ROS). Each ROS word is
a micro-instruction, and the aggregate of micro-instructions forms the microprogram.
The microprogram and the associated circuitry form the heart of the main control
unit of model 40. The micro-instruction may also be referred to as the control word.
Each control word has 56 bits and is divided into 20 fields.

A sequence of micro-instructions forms a micro-subroutine. There are many
micro-subroutines in the microprogram. Several micro-subroutines are now mentioned.
The fetch micro-subroutine reads the instruction to be executed from the main
storage location addressed by the instruction counter (which is the instruction
address field of the current PSW), decodes the op-code, branches to the appropriate
micro-subroutine, and then updates the instruction counter. The dump and undump
micro-subroutines store the CPU data flow in the local storage dump area and restore
it, respectively. The initial program-load (IPL) micro-subroutine loads 24 bytes of
data from the I/O device selected by the load switch on the console into the main
storage locations 0 to 23, and initiates command chaining to the CCW at location 8.
The selector-channel buffer-service micro-subroutine “steals” storage cycles to
transfer the data from the channel to the main storage, or vice versa.


9.5.6 Interrupt Supervisor 


As shown in Fig. 9.11, after the PSW's are switched, the interrupt supervisor
handles the interrupt according to the type of interrupt. If it is an I/O interrupt, the
interrupt supervisor transfers it to the I/O supervisor, which consists of the routines
that handle I/O interrupts. If the I/O interrupt indicates a satisfactory completion
of an I/O operation, the I/O supervisor starts a pending I/O operation and then returns
control to the problem program. If the I/O interrupt indicates an error, all interrupts
are masked and control is transferred to the error recovery routine for the particular
I/O device. In case recovery is not possible, the error is ignored, the record is
bypassed, control is transferred to a user's routine, or the job is terminated.


If it is a machine check interrupt, the interrupt supervisor places the CPU in
the wait state, allowing the operator to load and execute a stand-alone program that
will preserve the contents of the scan-out area and other available information in
printed output or in records on disk or drum.

Since the supervisor-call interrupt is caused by a request from the problem
program via the supervisor-call (SVC) instruction, the interrupt supervisor transfers
the control to the SVC interrupt routine. This routine examines the interrupt code
supplied with the SVC instruction and transfers control to the proper routine to
handle the request. Examples of the proper routines are the routine to load a program
from the core image library into storage for execution, the routine to provide
communication between the operator and the problem program, and the routine to
terminate the program and prepare for the next job to be run.

If the external interrupt is caused by the interval timer, control is transferred to
a user-supplied timer routine for handling if one has been supplied; otherwise, the
interrupt is ignored. If the external interrupt is caused by the interrupt key on the
console, control is transferred to the operator communication routine, which handles
messages directed to the supervisor by the operator, or messages issued to the
operator by the supervisor.

In the event of a program check interrupt, the interrupt supervisor transfers
control to the address of a user-supplied “program-error fixup routine” if one has
been supplied; otherwise, the problem program is terminated.


9.6 Supervisor and Other Controls 


As the speed of a digital computer increases, the need for a fully automatic
system likewise increases in order to fully utilize the capability of the computer.
Thus, it is imperative that the computer system be controlled by a supervisor program
and that manual control be kept to a minimum. This section describes a number of
hardware features that facilitate the supervisor control. The hardware features are:
storage protection to ensure survival of the supervisor program, an interval timer to
return control to the supervisor periodically, and the wait state available to the
supervisor rather than the use of a stop or halt instruction available to the problem
programmer. In addition, the direct control and the initial program loading are
briefly described.


9.6.1 Storage Protection 


For protection purposes, the main storage is divided into blocks of 2,048 bytes
each. Each block has a 4-bit storage-key register, and these four bits are called
the storage key, which may be regarded as the lock of the block. The storage key for
each block is inserted by the supervisor program.

To open the lock requires a protection key. A protection key is assigned to each
program and stored in bits 8-11 of its PSW. The protection key for a channel program
is stored in bits 0-3 of its CAW. When data are to be stored in a storage block, the
storage key and the protection key are compared. When they are equal or when the
protection key is 0, they are “matched,” and storing into the block is allowed. When a
mismatch occurs because an instruction attempts to store into a protected area,
execution of the instruction is suppressed or terminated and an interrupt is
generated. When a mismatch occurs while an I/O operation is being carried out, data
transmission is terminated. In either case, the contents of the main storage are
protected (i.e., remain unchanged).

As an example, let blocks A, B, C, D, and E in Fig. 9.13 have storage keys of
5, 0, 7, 5, and 7, respectively. The protection key in the PSW of the supervisor program
is 0; storing of words in any of these five blocks is permitted. If the protection key
in the PSW of a problem program is 5, then storing of words in blocks A and D is
permitted, while storing in blocks B, C, and E results in a program check interrupt.
If the protection key in the CAW is 7, then data transmission into blocks C and E
is permitted, but data transmission into other blocks results in an I/O interrupt.


Fig. 9.13 Storage protection (diagram: main storage blocks A to E of 2,048 bytes
each, with their storage key registers and fetch key register)
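
The key comparison in this example can be sketched as follows. The sketch follows
the worked example, in which a protection key of 0 matches every block and any other
protection key matches only blocks with an equal storage key; the names are
illustrative assumptions.

```python
# Sketch: store protection for blocks A to E of the example.
BLOCK_BYTES = 2048
STORAGE_KEYS = {"A": 5, "B": 0, "C": 7, "D": 5, "E": 7}


def keys_match(protection_key: int, storage_key: int) -> bool:
    """Return True if storing is permitted for this pair of 4-bit keys."""
    return protection_key == 0 or protection_key == storage_key


def storable_blocks(protection_key: int) -> list:
    """List the blocks into which a program with this key may store."""
    return [name for name, key in STORAGE_KEYS.items()
            if keys_match(protection_key, key)]


def block_number(address: int) -> int:
    """Return the number of the 2,048-byte block containing a byte address."""
    return address // BLOCK_BYTES
```

With these definitions, storable_blocks(0) yields all five blocks, storable_blocks(5)
yields A and D, and storable_blocks(7) yields C and E, as in the example.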

The protection described above is for store protection. On some models of the 
IBM System/360 computers, fetch protection is additionally available. For fetch 
protection, a single-bit fetch key register is provided for each 2,048-byte block. When 
the key register contains a 0, the block is protected against the storing of information 
in the block. If the key register contains a 1, the block is protected against the fetching 
of information from the block as well as the storing of information in the block. 
As an example, blocks A, B, C, D, and E in Fig. 9.13 have fetch keys of 1, 1, 0, 1, 
and 0, respectively. Blocks A, B, and D are for both store and fetch protection, while 
blocks C and E are for only store protection. 

The storage protection above allows the supervisor program to make changes
in any block of main storage, while a problem program can make changes only in its
own assigned blocks.


9.6.2 Interval Timer 


The full word in main memory location 80 represents the time of an interval
timer. The maximum interval is 15.5 hours. This time is set by the supervisor program
and counted down at a rate of 50 (or 60) cycles per second. When the time of the
interval timer goes from positive to negative, an external interrupt is generated to
signal the CPU. This interval timer can be used for job accounting by measuring the
duration of each job, for stopping a runaway job, for time-of-day recording,
and for polling a communication network at fixed time intervals.


9.6.3 Wait State 


When the PSW bit 14 is 1, the CPU is in the wait state. It makes no memory 
reference and executes no instructions, but it is responsive to any possible interrupt, 
providing immediate attention when called upon. Thus, provision of the wait state 
allows the CPU to remain controlled by the system when there is no problem program 
to be executed. 


9.6.4 Direct Control 


The privileged instructions READ DIRECT and WRITE DIRECT are provided
for transmitting a single byte of data between an external device and the main
storage without making use of the channel; a direct data path is provided for
the external device.

When a direct control device has a byte of data to transmit, it signals an interrupt 
on one of the six external lines. This sets the appropriate external interrupt register 
and also the external interrupt request register. If the external mask bit (the PSW 
bit 7) is 1, this interrupt will be accepted. The CPU, in accepting the interrupt, deter- 
mines the source of interrupt and transfers the control to the direct control mi- 
crosubroutine which makes use of both READ DIRECT and WRITE DIRECT 
instructions. 


9.6.5 Initial Program Loading 


The initial program loading (IPL) is a hardware bootstrap loader that starts the
loading of the first program into the main memory. The IPL is initiated by pressing
the load button. After the load button is pressed, the following sequence of steps 
occurs: 


1. A CCW which specifies the command (read, command chaining, data address = 0,
count = 24) is generated in main storage location 0.


2. A start I/O instruction is initiated on the device (such as a card reader) addressed by
the load switch.


3. The execution of this start I/O instruction causes the first 24 columns of the first 
card to be read into the main storage, starting at location 0. The information punched 
into this card must be: 

Columns 1-8: the first PSW to be used after IPL is completed. 
Columns 9-16: the second CCW of the IPL command chain. 
Columns 17-24: the third CCW of the IPL command chain. 


4. The first CCW specifies command chaining: thus, the next CCW is fetched from the 
main storage location 8 (i.e., card column 9). Further operation is therefore already 
under control of the information just read in. (How the first program is read in 
depends on the channel program provided by the programmer.) 


5. The IPL is terminated as soon as a CCW no longer specifies command chaining. The 
PSW stored at location 0 of the main memory is loaded into the CPU. The system 
status is the same as specified in this PSW, and the first instruction is fetched from the 
location indicated by the instruction address in this PSW. 


Problems


9.1. Conceive and describe a control organization which fetches the instructions with the
formats having the three possible lengths as those shown in Fig. 9.4.


9.2. Conceive and describe a control organization to implement the register and memory
addressing in the fetch cycle for the instruction formats shown in Fig. 9.4.


9.3. Conceive and describe an organization to exemplify the storage protection illustrated in 
Fig. 9.13. Make use of the formats described in Fig. 9.4. 


9.4. Revise the priority-interrupt organization described in statements (8.32)-(8.38) if the 
interrupt sources and their priority levels are those described in this chapter. 


9.5. Repeat Problem 9.4 if the PSW switching for the five types of interrupts is to be
additionally provided.


9.6. Repeat Problem 9.5 if the setting of the interrupt code in the PSW is to be additionally 
provided. 


The I/O organization of a modern computer system aims at as much utilization of
the main memory and CPU as possible, while a single I/O operation or a multitude
of I/O operations are being carried out. The difficulty in realizing this utilization is
due to the great disparity between the data rates of the main memory and the
I/O devices. For example, the main memory of a modern computer system may
have a data rate of one million 64-bit words per second or more, while a keyboard
generates only about ten 6-bit words per minute. Even the highest I/O data rate
is about one million 8-bit bytes per second. Therefore, computing and processing
in the CPU must proceed concurrently with I/O data transfers with as little
interference from the I/O operations as practically possible. The use of a channel as
the I/O control can accomplish concurrency to a higher degree. Indeed, the
advent of the channel has been one of the most important organizational
advancements in modern computer systems.

This chapter first introduces several I/O control organizations. It then
describes, in some detail, the channel organization and operation of the IBM
System/360 model 40 computer. The description includes the selector channel,
the multiplexor channel, and the I/O interface.


Chapter 10 Channel Organization


10.1 I/O Control Organization


Input-output control organization in a computer system has greatly changed 
since the advent of the first electronic stored-program computer. This has been partly 
due to the use of computers for applications beyond the capacity of the main memory 
so that one or more large-capacity external memories must be employed as I/O devices 
and partly due to the wide disparity between the internal speed of a computer and 
the speed of the I/O devices. 

The problem of I/O control is essentially the problem of interface between the 
main memory and the I/O device. The interface problem arises from different codes, different
formats, different speeds, different manners of operation, and different kinds of hard- 
ware. The code interface is sometimes handled by hardware and sometimes by soft- 
ware, depending on the particular I/O device. The speed and format interfaces are 
commonly handled by using a buffer in which a memory word is assembled or dis- 
assembled. The operational interface is handled by using asynchronous operation and 
by “memory-cycle stealing” in order to keep the main memory and the CPU busy 
despite the relatively slow I/O data transfer. The hardware interface is due to different 
circuitry and different reliability as well as due to the need of attaching one or more 
I/O devices of different types to the computer system. 

Many I/O control organizations have been conceived and implemented. Their 
difference lies largely in the manner and in the degree of concurrent operation between 
the CPU and the I/O devices. This concurrent operation is most important in I/O 
control organization, because it greatly affects the efficient utilization of the computer 
system. 


10.1.1 Direct Program Control 


Direct program control governs the transfer of one data
unit from an I/O device by the execution of one I/O instruction. To transfer multiple
units of data, multiple instructions are required. For example, the following
instructions read three words from an input device,


Read x 
Copy a
Copy b 
Copy c 


where the read instruction prepares to read one data unit on the I/O device specified 
by x and starts mechanical movement of the device if necessary. Each of the copy 
instructions stores the word arriving from the input device at memory address
a, b, or c.

The above transfer occurs first from the input device to a register such as the
MQ register in the CPU, and then to the main memory via the buffer register of the
memory; this is shown in Fig. 10.1. The control unit of the CPU is used for I/O
control. It performs the functions of selecting an I/O device, synchronizing the I/O
operation, transferring data to the main memory, and disconnecting the I/O device


[Figure not reproduced; legible labels: main memory, buffer register, central control, MQ register, input-output distributor, I/O control, card reader, line printer, magnetic tape memory.]

Fig. 10.1 Direct program I/O control configuration


when the operation is completed. If the number of words to be transferred is also 
specified in the copy instruction, it is stored in a counter. For each word-transfer, 
the counter is decremented by one. The transfer is terminated when the counter 
reaches zero.

The above use of one copy instruction for transferring one word allows the 
transfer of records of different lengths by using a different number of copy instruc- 
tions. It also gives the advantage of allowing scatter-read of a tape record into several 
memory word-blocks. However, the multiple copy instructions are usually replaced 
by a short iteration loop containing one copy instruction. 

When a copy instruction is being executed, there is time to execute several other 
instructions because of the slow pace of the I/O device. For example, it may be 
possible to perform code conversion during the intervals which occur in card reading, 
card punching, or line printing. Nevertheless, the programmer has to be aware of 
the amount of time available, and this time depends on the type of the I/O device. 
When the incoming data arrives for another word transfer while the next copy instruc- 
tion is not being executed, the I/O device becomes disconnected and the data transfer 
is disrupted. Since preparation of I/O programs is not simple, they are usually pre- 
pared for use by system programmers. 

The direct program control is the simplest organization for I/O control. It was 
used for early digital computers such as the IBM 701 computer. It is still employed 
for the inexpensive digital computers of today. 


10.1.2 I/O Data Buffering


The use of the CPU for controlling I/O data transfers in the above direct program 
control is not economical because the CPU spends most of the time waiting for the 
I/O data. If a separate I/O register is provided for buffering the data, the CPU may 
execute other instructions of the program while waiting for the I/O data transfer. 
Thus, the employment of I/O registers allows for a degree of concurrent operation 
because the CPU can be utilized during waiting. 

For an efficient CPU operation, certain means have to be provided for the I/O 
device to signal to the CPU when the I/O data transfer is completed. For early com- 
puters, this was accomplished by a short iteration loop inserted at one or more places 
in the program being executed. Since 1960, most computers have been provided with 
an interrupt scheme; thus, the completion of an I/O operation can be indicated by 
an I/O interrupt signal. No program loop is required for such computers. 

The use of one I/O register for buffering the data is sometimes not effective 
because the time allowed for waiting is rather short. Instead of one register, a set of 
registers or a buffer memory is sometimes used. The buffer has its own control which 
assembles or disassembles the data and stores a block of words. When the buffer is 
full, its contents are transferred to the main memory at one time.

Because of the larger capacity of the buffer, program execution by the CPU
is disrupted less often. Thus, the use of a buffer allows better concurrent
operations between the CPU and an I/O device. However, there are two disadvantages. 


The first is the limitation on block size imposed by the buffer; the block size cannot 
be larger than the buffer capacity unless more complex control is provided. The 
second is the cost of the buffer. The cost becomes multifold when concurrent opera- 
tions are required for several I/O devices, because one buffer is required for each I/O 
device in operation. 

An ingenious solution to eliminating the buffer has been found. It is sometimes 
called cycle-stealing. During a normal computer operation, the main memory is kept 
busy for program execution by the CPU in one memory cycle after another. When 
an I/O data transfer is ready, one memory cycle is “stolen” from its use by the CPU. 
The I/O data is transferred to or from the main memory during this memory cycle
and the program execution is held up. At the end of this memory cycle, the CPU 
resumes its operation from where the cycle-steal occurred. The cycle-stealing opera- 
tion is different from the interrupt operation since the disruption is for the fixed short 
time interval and there is no need for saving register contents. 

The cycle-stealing is extraordinarily useful. The CPU wastes no time, except 
the time period during which the I/O data word is fetched or stored. The buffer needs 
to be only a register for one memory word. The relatively inexpensive memory loca- 
tions are actually also used as the buffer. Most important, the cycle-stealing permits 
buffering for a multitude of I/O devices. 
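
The cost of cycle-stealing to the CPU can be illustrated with a small simulation. The sketch below is not taken from any particular machine: the function name is hypothetical, and it assumes that the device has one word ready every `device_period` memory cycles and that each transfer steals exactly one cycle.

```python
def run_with_cycle_stealing(total_cycles, device_period):
    """Count memory cycles used by the CPU and cycles stolen for I/O.

    Assumption: the device has one word ready every `device_period`
    memory cycles, and each transfer steals exactly one memory cycle.
    """
    cpu_cycles = 0
    stolen_cycles = 0
    for cycle in range(1, total_cycles + 1):
        if cycle % device_period == 0:
            stolen_cycles += 1  # I/O word moves to or from main memory
        else:
            cpu_cycles += 1     # CPU continues program execution
    return cpu_cycles, stolen_cycles
```

For a device that needs one memory cycle in every hundred, the CPU keeps 99 percent of the memory cycles, which is why no register contents need to be saved and why the disruption is small compared with an interrupt.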


10.1.3 Data Channel 


In order to achieve a high degree of concurrent operation between the CPU and 
an I/O device, the data channel has been developed. The data channel is a processor 
specially designed for handling I/O operation. When the CPU executes an I/O instruc- 
tion, it initiates the channel and turns the control of the I/O operation over to 
the channel instead of carrying out the I/O operation itself. If there is more than one 
channel, more concurrent I/O operations can occur. Similar to the CPU, the data 
channel is capable of executing instructions, which are called channel commands (or 
simply commands). A sequence of these commands forms a channel program which
allows many versatile I/O read and write operations concurrent with the CPU opera- 
tion. Indeed, the advent of the data channel has been one of the most important 
organizational advancements in modern digital computer systems. 

The IBM data channel 7607 for the IBM 7090 family of computers is now used
as an example. As many as eight data channels can be attached to the computer; 
thus, there can be as many as eight I/O devices operating concurrently with the CPU. 
As shown in Fig. 10.2, the channels and the CPU are connected to the main memory 
through a switching unit, called a multiplexor, for the purpose of switching the main- 
storage access between the CPU and the channels. 

A configuration of the data channel 7607 is shown in Fig. 10.3. Register DATA 
is the buffer register between the main storage and an I/O device; it serves as the 
buffer where one memory word is assembled or disassembled. Operation register OP 
specifies one of the eight commands. Word counter WC specifies the number of
words to be transmitted between the main storage and the I/O device. Channel address
counter CAC holds the main storage address of the next word to be transferred.


[Figure not reproduced; legible labels: Channel A, Channel B, I/O units.]

Fig. 10.2 Configuration of the IBM 7090 computer system


[Figure not reproduced; legible labels include: manual entry keys, data channel console, multiplexor storage bus, channel input switch, location counter switch, channel address switch, address switch, registers ENTRY, EXIT, and TAPE, card reader, printer, magnetic tape unit, memory address register.]

Fig. 10.3 Configuration of the IBM data channel 7607


The location counter (LC), similar to the instruction counter, contains the location 
of the current data channel command plus one. Register TAPE is a 6-bit buffer for 
the magnetic tape device. The 36-bit registers ENTRY and EXIT are buffer registers 
for card reader and printer. They can be used in combination as a 72-bit register. 
The entry keys are those switches on the channel console for manual inputs. The 
address switch, the input switch, and the counter switch in Fig. 10.3 are switching 
units for providing the signal paths. This configuration is now described by the state- 
ments below. 


Comment, configuration of the IBM 7607 data channel.            (10.1)
Register, DATA(S,1-35),              $data register
          OP(S,1,2,19),              $operation register
          WC(3-19),                  $word counter
          CAC(3-17),                 $channel address counter
          LC(3-17),                  $location counter
          TAPE(S,1-35),              $tape register
          EXIT(S,1-35),              $output register
          ENTRY(S,1-35),             $input register
Switch,   ENTRYKEY(S,1-35)(ON,OFF),  $entry keys on data channel console


Data channel 7607, once started, operates asynchronously. When a single command
transmits a large block of words between the main storage and the I/O device,
many instructions in the main program may be executed at the same time.

Although a number of I/O devices may be connected to a data channel, only one 
can be in operation at one time, because the data channel can handle only the data 
transfer of one I/O device at one time. When there is more than one data channel, 
a priority scheme for the channels is required so that each channel can be serviced 
according to the desired order. 

Data channel 7607 may be initiated by the execution of two instructions in the 
CPU. As an example, writing a block of 50 memory words located at Y, Y+1,
..., Y+49 onto the magnetic tape on unit 2 in channel A requires the following
instructions and commands, 


WTBA 2 
RCHA JOE 


JOE IOCD  Y,50 


Instruction WTBA, write tape binary, selects tape unit 2 on channel A and specifies 


the binary write operation. Instruction RCHA, reset and load channel A, gives
channel A memory location JOE where the command IOCD is. Command IOCD,
input-output under count control and disconnect, causes the block of words located at
Y, Y+1, ..., Y+49 to be written on tape unit A2. When 50 words have been written,
an end-of-record gap is written and the channel disconnects the tape unit.




10.1.4 Multiplexor Channel 


As described above, one data channel can carry out one I/O operation concurrent 
with the CPU. Such a channel is sometimes called a selector channel. A selector
channel allows I/O data transfer at a high data rate. It is uneconomical to have
a data channel operating a slow I/O device. Thus, it is highly desirable for the channel
to operate concurrently with a multitude of slow I/O devices. Such a situation
occurred earlier in communication engineering, and the solution has been the use of
multiplexing in time. The concept of time-multiplexing the channel operation is
illustrated in Fig. 10.4. As shown, there are six I/O control units; each unit controls 


[Figure not reproduced; legible labels: rotary switch, I/O control units, I/O devices.]

Fig. 10.4 Multiplexing of a data channel


an I/O device. These six control units timeshare the channel by means of a rotary 
switch. The channel initiates or handles I/O operation on one control unit after 
another in turn by means of the rotary switch, because there is plenty of time in
waiting for an I/O device to carry out an operation. Such a channel is called a 
multiplexor channel. 

The multiplexing scheme shown in Fig. 10.4 is not practical. It requires as many 


cables to connect the I/O control units as the number of I/O control units. A more 
practical idea is shown in Fig. 10.5, where there is only one cable connecting the I/O


[Figure not reproduced; legible labels: memory, I/O control units, I/O devices, I/O interface.]

Fig. 10.5 Configuration of a multiplexor channel


control units to the channel. This cable, called the I/O interface, contains, among
other lines, address lines and a scan line. Each I/O control unit is assigned an
address. The channel sends out a signal on the scan line followed almost immediately 
by the signals on the address lines. When an I/O control unit is activated by the scan 
signal, it compares the incoming address with the assigned address. If the two 
addresses do not agree, the scan signal continues to the next I/O control unit until 
the control unit with the agreed address is found, or until the scan signal returns to 
the channel to indicate that the addressed I/O control unit is neither operational nor 
attached to the channel. 
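
The selection of a control unit by the scan signal can be sketched as a simple search along the interface. The following Python fragment is only a model of the behavior described above, not of the actual signal timing; the function name and the example addresses are hypothetical.

```python
def scan_for_unit(attached_addresses, wanted):
    """Pass the scan signal from unit to unit along the I/O interface.

    `attached_addresses` lists the assigned addresses of the control
    units in cable order. Returns the position of the unit that
    recognizes `wanted`, or None when the scan signal returns to the
    channel without being captured.
    """
    for position, assigned in enumerate(attached_addresses):
        if assigned == wanted:
            return position  # this unit captures the scan signal
        # otherwise the unit passes the scan signal on to the next unit
    return None
```

For example, with units assigned the addresses 0x0A, 0x0B, and 0x1C, a request for 0x0B is answered by the second unit, while a request for an unassigned address such as 0x7F falls through every unit and returns to the channel.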

The multiplexor channel actually consists of many subchannels. For example, 
there can be as many as 128 subchannels in the multiplexor channel of the IBM 
System/360 model 40 computer. Each subchannel contains all necessary information 
for the control of one I/O device. When several subchannels carry out I/O 
operations simultaneously, the multiplexor channel is said to be in the multiplex
mode of operation. In this mode, each I/O control unit is logically connected to the
channel only for the time required to transfer one byte of data; for this reason,
the multiplex mode is also called the byte mode. Between the transfer of two bytes,
the I/O control unit is disconnected and the channel is free to operate other I/O
control units. Therefore, the data transfers from several I/O units on the multiplexor
channel are interleaved. It is also possible to have the channel operate only one I/O device
at a high speed. In this case, the channel is said to be in burst mode. When the multi- 
plexor channel is in burst mode, it operates like a selector channel. The multiplexing 
of the channel is at the expense of reducing the data transfer rate of the channel. 
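
The difference between the two modes can be illustrated by the order in which bytes cross the interface. The sketch below is a simplification under stated assumptions: the function and device names are hypothetical, and a strict round-robin order is assumed for the multiplex mode, whereas an actual channel serves the control units as they request service.

```python
from itertools import zip_longest

_DONE = object()  # marks a device whose data stream is exhausted


def multiplex_mode(streams):
    """Interleave the bytes of several devices, one byte per connection."""
    order = []
    for group in zip_longest(*streams.values(), fillvalue=_DONE):
        for device, byte in zip(streams.keys(), group):
            if byte is not _DONE:
                order.append((device, byte))
    return order


def burst_mode(streams, device):
    """Transfer the whole stream of one device without interleaving."""
    return [(device, byte) for byte in streams[device]]
```

With two devices supplying the byte strings b"AB" and b"xyz", the multiplex mode delivers A, x, B, y, z, each byte tagged with its device, while the burst mode delivers all bytes of the selected device consecutively.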

A multiplexor channel, as well as a selector channel, is capable of performing
some or all of the following functions:


1. Accepting an I/O instruction from the CPU.
2. Addressing and selecting the I/O device specified by the I/O instruction.
3. Fetching, decoding, testing, and executing channel control information.
4. Generating control signals to operate the I/O interface.
5. Accepting the status information and storing it in main storage.
6. Buffering the data for transfer to or from the main storage.
7. Counting and parity-checking the data bytes.
8. Maintaining channel-status information.
9. Handling interrupt requests from the I/O devices.
10. Signaling I/O interrupt to the CPU.


10.1.5 I/O Processors


In order to make the CPU perform less I/O activity and thus perform more
computing and processing, I/O processors are used. The I/O processor is a stored program
computer often operating independently of the CPU. The data channels are now 
connected to the I/O processors. In some cases, the I/O processor has adequate 
arithmetic, logical, branching, and buffering capabilities to execute I/O programs. 
In other cases, the I/O processor does not perform complex operations, but it calls 
on the CPU for such processing. The I/O processors, after manipulating the I/O 
data, store the data in the main memory and then notify the CPU by means of an 
interrupt. 

The CDC 6000 family of computers is now used as an example. In addition to 
a CPU and a main memory of 131,072 60-bit words, there are 10 I/O processors 
(called peripheral and control processors) and 12 I/O channels, as shown in Fig. 
10.6. Each peripheral processor, or simply PP, has a memory of 4,096 12-bit words 
and can access any of the 12 channels. It is a simple computer that can operate
independently of both the CPU and other PP’s. The CPU and all the PP’s can access
the main memory. Although the CPU is a very fast processor, it has no I/O
capability. If the CPU requires an I/O operation, it places the data in the main memory
and sends a signal to a PP. The PP accepts the signal and the CPU then proceeds to 
other processing. If it is an output operation, the PP transfers the data into its memory 
and then to an I/O device via a channel. 

The 10 PP’s are all identical. Each has its own 4,096-word memory, program 
address register, accumulator, and control registers. Otherwise, the 10 PP’s time-share 
one processing unit. There are 64 instructions for add, subtract, logical, branch, 
input/output, the main-memory access, and the CPU access. A PP can detect which 
channels are inactive and choose one to transmit one or more words with one instruc- 
tion. When a PP initiates an I/O operation, it usually stays in a loop until the opera- 
tion is complete, because there is no interrupt hardware in the 6000 computers. 

There are 12 I/O channels. Each channel has a 12-bit channel register and two 
flags. There are 12 instructions available to direct activity on the I/O channels. Each 
channel has a transfer rate of one million 12-bit words per second. It can be connected 


[Figure not reproduced; legible labels: main memory, 10 I/O processors, 12 I/O channels.]

Fig. 10.6 I/O processors of the CDC 6000 family of computers


to one or more I/O devices. Only one device can communicate on one channel at one 
time, but all 12 channels can be active at the same time. 

The PP's may also be used as supervisory processors. For example, one PP, 
designated as the executive PP, contains the master control program. It schedules 
all jobs, including the jobs executed by the CPU. Another PP, designated as the 
monitor PP, monitors the program execution in the CPU. The remaining eight PP's 
are held in a pool. The executive PP selects one from the pool when the need arises, 
and instructs the PP to carry out the operation. 


10.2 Channel Operation 


The channel is a device for the control of the I/O devices attached to a computer.
It controls the transmission of data and control information between the I/O devices 
and the CPU, and also between the CPU and the main storage. In order to present 
some details of modern channels, the channel organizations of the IBM System/360 
model 40 are presented in the remaining sections of this chapter. 

There are five control words involved in the channel operation of the IBM 
System/360 computers: 


1. Channel address word, CAW 
2. Channel command word, CCW 
3. Channel status word, CSW 


4. Unit control word, UCW 
5. Program status word, PSW 


This section describes these control words together with the channel commands, 
the channel program, the I/O instructions, and the I/O interrupts. 


10.2.1 Channel Address Word 


The channel address word (CAW) is a full word, permanently located at main 
storage location 72. The format is shown in Fig. 10.7. There are three fields, namely, 


Key      0000     Command address
0    3 4    7 8                31

Fig. 10.7 Channel address word (CAW) format


the key field, the zero field, and the command address field. The key field, bits (0-3), 
gives the I/O protection key which controls the access to the main memory during 
the I/O operation. The zero field, bits (4-7), must always contain zeros. The command 
address field, bits (8-31), specifies the main memory location of the first channel 
command word. 

The CAW must first be stored at main memory location 72 before the START 
I/O instruction is executed. When the START I/O instruction is executed, the CAW
word is fetched from the memory. The first channel command is then fetched from 
the command address in the CAW. 
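
The CAW layout can be made concrete by packing and unpacking the three fields of a 32-bit word. The sketch below is illustrative only; the function names are hypothetical. It follows the convention used in the text, in which bit 0 is the most significant bit of the word.

```python
def pack_caw(key, command_address):
    """Build a 32-bit CAW: key in bits 0-3, zeros in 4-7, address in 8-31."""
    if not 0 <= key <= 0xF:
        raise ValueError("key must fit in 4 bits")
    if not 0 <= command_address <= 0xFFFFFF:
        raise ValueError("command address must fit in 24 bits")
    return (key << 28) | command_address  # bits 4-7 are left as zeros


def unpack_caw(word):
    """Split a 32-bit CAW into its key and command address fields."""
    if (word >> 24) & 0xF:
        raise ValueError("bits 4-7 of the CAW must be zero")
    return {"key": (word >> 28) & 0xF, "command_address": word & 0xFFFFFF}
```

For instance, a protection key of 7 and a first CCW at location 1000 hex. give the word 70001000 hex., which would be stored at main storage location 72 before START I/O is issued.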


10.2.2 Channel Command Word 


The channel command word (CCW) is a double word, stored in the main storage.
The format is shown in Fig. 10.8. There are five fields: the command code field, the
data address field, the flag field, the zero field, and the count field. The command
code field, bits (0-7), specifies one of the six I/O commands listed in Table 10.1.
The data address field, bits (8-31), specifies the first main storage location where the
first word is stored in the cases of the read, read-backward, and sense operations.
It specifies the first main storage location from which the first word is taken in the cases of
the write and control operations. The flag field, bits (32-36), contains five flags. The
chain data flag CD (bit 32) when 1 indicates data chaining (see below). The chain
command flag CC (bit 33) when 1 indicates command chaining (see below). The
suppress length indication flag SLI (bit 34) when 1 suppresses the incorrect length
indication when the channel detects an incorrect length record. The skip flag (bit 35)
when 1 suppresses the data transfer from an input unit to the main storage while
the external document is moving (such as skipping of card columns). This flag is used
for a read, a read-backward, or a sense operation. The program control interrupt

Command code   Data address   Flags    000     Ignored   Count
0          7   8         31   32  36   37 39   40   47   48   63


Bits 0-7 Command code
8-31 Data address
32-36 Command flags
32 Chain data flag
33 Chain command flag
34 Suppress length indication flag
35 Skip flag
36 Program-controlled interruption flag
37-39 Zero
40-47 Ignored
48-63 Count


Fig. 10.8 Channel command word (CCW) format


TABLE 10.1 Channel Command Code


COMMAND                       CODE
Sense (when read forward)     MMMM 0100
Transfer-in-channel           XXXX 1000
Read-backward                 MMMM 1100
Write (forward)               MMMM MM01
Read (forward)                MMMM MM10
Control                       MMMM MM11


Note: M indicates a modifier bit and X indicates an ignored bit.


flag PCI (bit 36) when 1 generates an I/O interrupt to present status information to 
the CPU. Bits 37-39 must be zero unless the command is TIC. Bits 40-47 are ignored 
by the channel. The count field, bits (48-63), specifies the number of bytes to be
transferred.
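
The field boundaries of the CCW can be checked by packing and unpacking a 64-bit value. The following Python sketch is illustrative only; the function names and the flag-name tuple are hypothetical, and bit 0 is taken as the most significant bit, as in Fig. 10.8.

```python
FLAG_NAMES = ("CD", "CC", "SLI", "SKIP", "PCI")  # bits 32-36, left to right


def pack_ccw(command, data_address, flags, count):
    """Build a 64-bit CCW; `flags` is the 5-bit flag field (bit 32 first)."""
    return (command << 56) | (data_address << 32) | (flags << 27) | count


def unpack_ccw(ccw):
    """Split a 64-bit CCW into command, data address, flags, and count."""
    flag_field = (ccw >> 27) & 0x1F                 # bits 32-36
    return {
        "command": (ccw >> 56) & 0xFF,              # bits 0-7
        "data_address": (ccw >> 32) & 0xFFFFFF,     # bits 8-31
        "flags": {
            name: bool((flag_field >> (4 - i)) & 1)
            for i, name in enumerate(FLAG_NAMES)
        },
        "count": ccw & 0xFFFF,                      # bits 48-63
    }
```

As a usage example, a read command (code 02 hex., no modifier bits) with data address 1000 hex., the chain data flag set, and a count of 50 packs into 0200100080000032 hex.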


10.2.3 Channel Commands 


As shown in Table 10.1, there are six I/O commands: read, write, control, read- 
backward, sense, and transfer-in-channel (TIC). In a write operation, the channel 
fetches words from the main storage and transfers one byte at a time to the I/O device. 
In a read or a read-backward operation, the channel accepts one byte at a time from 
the I/O device and transfers one word at a time to the main storage. In a read opera- 
tion, words are stored in main storage in the ascending order of addresses. In a read- 
backward operation, words are stored in the main storage in the descending order of 
addresses. In a sense operation, the addressed I/O device transfers the current status 
of the I/O device and unusual conditions across the interface. In a transfer-in-channel
operation, the CCW from the location specified by the data address field of this 


command is fetched instead of the next sequential CCW. It thus causes a branch 
from one CCW sequence to another. 


10.2.4 Channel Program 


A channel program specifies the operations initiated by one I/O instruction. It contains
a series of one or more CCW's. If it contains more than one CCW, the CCW’s are
said to be chained. There are two types of chaining: command chaining and data 
chaining. Command chaining refers to a number of CCW’s with different commands, 
while data chaining refers to a number of CCW’s with the same command but with 
different data areas. 

With command chaining, the channel uses the new CCW to perform a new 
operation on the device and thus permits the CPU to initiate such sequences as 
printing multiple lines or reading multiple tape blocks with a single I/O instruction. 
Command chaining allows the channel to execute a channel program with a number 
of I/O operations. 

With data chaining, the channel uses the new CCW to designate another data 
area for the original I/O operations; the I/O device continues to execute this opera- 
tion. Thus, data chaining permits the reorganization of data as it is transferred 
between the main storage and the I/O device. 

Three simple channel programs are now shown. The first example consists of 
only one CCW as follows: 


Command    Data Address    Flags       Count
           (hex.)          (binary)    (decimal)
Read       1000            00000       200


The above channel program reads one complete record of 200 bytes from a magnetic
tape drive (specified by the modifier bits in the command code) into the main memory
locations from 1000 hex. through 10C7 hex.

The second example consists of the following three CCW’s. 


CCW    Command    Data Address    Flags       Count
                  (hex.)          (binary)    (decimal)
1      Read       1000            10000       50
2      Ignored    Ignored         10010       20
3      Ignored    2000            00000       130


The above channel program reads a 200-byte record from a magnetic tape drive 
(specified by the modifier bits in the command code) with bytes 1-50 in the main 
memory at location 1000 hex., with bytes 51-70 ignored, and with bytes 71-200 into 
the main memory at location 2000 hex. The above read operation is carried out with 
data chaining since the CD flag in the first and the second CCW is 1. 
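The effect of the second channel program can be imitated with a short Python sketch. It is only an illustration under stated assumptions: the class CCW, the function run_data_chained_read, and the dictionary used as main memory are hypothetical names, and the five flag bits are assumed to be ordered CD, CC, SLI, skip, PCI, which is inferred from the flag values in the three examples.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CCW:
    command: Optional[str]        # ignored when the CCW is reached by data chaining
    data_address: Optional[int]   # ignored when the skip flag is set
    flags: str                    # assumed order: CD, CC, SLI, skip, PCI
    count: int


def run_data_chained_read(ccws, record, memory):
    """Distribute one incoming record over the data areas of data-chained CCWs."""
    position = 0
    for ccw in ccws:
        chunk = record[position:position + ccw.count]
        skip = ccw.flags[3] == "1"
        if not skip:
            for offset, byte in enumerate(chunk):
                memory[ccw.data_address + offset] = byte
        position += len(chunk)
        if ccw.flags[0] != "1":   # CD flag is 0: no further data chaining
            break
    return position


# The second example: 50 bytes stored, 20 bytes skipped, 130 bytes stored.
program = [
    CCW("read", 0x1000, "10000", 50),
    CCW(None, None, "10010", 20),
    CCW(None, 0x2000, "00000", 130),
]
record = bytes(range(200))        # a made-up 200-byte tape record
memory = {}
transferred = run_data_chained_read(program, record, memory)
```

Only 180 of the 200 bytes reach the memory in this sketch, since bytes 51-70 are skipped.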




The third example consists of the following three CCW’s: 


CCW    Command    Data Address    Flags       Count
                  (hex.)          (binary)    (decimal)
1      Write      1000            01000       50
2      Control    Ignored         01100       1
3      Control    Ignored         00100       1


The above channel program writes a 50-byte trailer label on the magnetic tape (speci- 
fied by the modifier bits in the command code) starting from main memory location 
1000 hex., writes a tape mark (specified by the modifier bits in the command code),
and rewinds and unloads the tape (specified by the modifier bits in the command 
code). The above write operation is carried out by command chaining since the CC 
flag in the first and the second CCW is 1. Since the initial count of a CCW may 
never be zero, an initial nonzero count has to be specified in control commands that 
do not transfer control data, and the SLI flag is set to suppress an incorrect-length 
indication. 

With a proper channel program, it is possible to perform the following types of 
I/O data transfers and operations: scatter-read (reading one physical record into 
multiple, noncontiguous areas of the memory), extraction (reading only selected 
portions of a record into the memory), nondata I/O operations (such as backspace 
and rewind), and a sequence of operations on the same device (such as reading over 
an interrecord gap). 


10.2.5 Channel Status Word 


The status of the channel, the I/O control unit, the I/O device, and the location 
of the next CCW are indicated by the channel status word CSW. It is a double word, 
stored at main storage location 64. The CSW is updated after the completion of an 
I/O operation to provide information about its termination. 

The CSW format is shown in Fig. 10.9. There are five fields: the key field, the 
zero field, the command address field, the status field, and the count field. The key 
field (bits 0-3) contains the protection key for the channel. The zero field (bits 4-7) 
is always zero. The command address field (bits 8-31) stores the address of the next
CCW. The status field (bits 32-39) stores the status of the I/O control unit and I/O 
device, and the status field (bits 40-47) stores the status of the channel. The count 
field (bits 48-63) indicates the residual count after the termination of the I/O opera- 
tion. 

The status bits 32-39 are designated as follows. Bit 32 (attention) is generated
when a manual operation is initiated through such a device as the console typewriter.
Bit 33 (status modifier) indicates that the I/O unit cannot execute the command.
Bit 34 (control unit end) indicates that the control unit is now free. Bit 35 (busy)
indicates that the device is busy. Bit 36 (channel end) is always generated when the
I/O control unit no longer needs the channel and is able to complete the I/O operation
on its own. Bit 37 (device end) is always generated when the device reaches its
mechanical ending point or when a device such as a tape drive is switched from the
not-ready to the ready status. Bit 38 (unit check) is usually generated
when the control unit or the device has detected machine malfunction. Bit 39 (unit
exception) is generated when the I/O device detects a condition that does not normally
occur, such as recognition of a tape mark.


Bits 0-3      Protection key
     4-7      Zero
     8-31     Command address
     32-47    Status
                32  Attention
                33  Status modifier
                34  Control unit end
                35  Busy
                36  Channel end
                37  Device end
                38  Unit check
                39  Unit exception
                40  Program-controlled interruption
                41  Incorrect length
                42  Program check
                43  Protection check
                44  Channel data check
                45  Channel control check
                46  Interface control check
                47  Chaining check
     48-63    Count


Fig. 10.9 Channel status word format

The status bits 40-47 are designated as follows. Bit 40 (program-controlled
interruption, PCI) is generated when the CSW is updated because of the PCI flag in
the CCW's. Bit 41 (incorrect length) is generated if the count of the CCW is not 
zero at the end of an I/O operation and the SLI flag is 0. Bit 42 (program check) 
indicates an invalid condition such as invalid CCW address or invalid command key. 
Bit 43 (protection check) is generated if, during the I/O operation, an attempt to 
violate main memory protection is made. Bit 44 (channel data check) indicates the 
detection of invalid data by the channel. Bit 45 (channel control check) is generated 
by any machine malfunction affecting channel controls. Bit 46 (interface control 
check) indicates invalid signals on the interface. Bit 47 (chaining check) indicates 
over-run conditions during data chaining on input operations. 
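The CSW layout described above can be unpacked with a few shifts and masks. The following Python sketch is illustrative only; the function decode_csw and the table CSW_STATUS_NAMES are hypothetical names, and the CSW is assumed to be given as a 64-bit integer whose bit 0 is the most significant bit.

```python
# Status bit names, keyed by CSW bit number (bit 0 is the most significant bit).
CSW_STATUS_NAMES = {
    32: "attention", 33: "status modifier", 34: "control unit end", 35: "busy",
    36: "channel end", 37: "device end", 38: "unit check", 39: "unit exception",
    40: "program-controlled interruption", 41: "incorrect length",
    42: "program check", 43: "protection check", 44: "channel data check",
    45: "channel control check", 46: "interface control check",
    47: "chaining check",
}


def decode_csw(csw):
    """Split a 64-bit CSW into its key, command address, status, and count."""
    def field(first, last):
        width = last - first + 1
        return (csw >> (63 - last)) & ((1 << width) - 1)

    return {
        "key": field(0, 3),
        "command_address": field(8, 31),
        "status": [name for bit, name in CSW_STATUS_NAMES.items() if field(bit, bit)],
        "count": field(48, 63),
    }


# A made-up CSW: key 3, next CCW at 001008 hex., channel end and device end, residual count 5.
example_csw = (0x3 << 60) | (0x001008 << 32) | (1 << 27) | (1 << 26) | 5
decoded = decode_csw(example_csw)
```

A nonzero count together with the incorrect-length bit would correspond to the situation described for bit 41.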


10.2.6 Unit Control Words 


All information necessary to control an I/O operation on a channel is held in
a unit control word (UCW). The general composition of a UCW is shown in Table
10.2. The number and the format of the UCW are different for the selector and
multiplexor channels. These UCW’s will be described later.


TABLE 10.2 Composition of Unit Control Word


SOURCE             DATA
I/O instruction    Device address
CAW                CCW address, protection key
CCW                Command code, flags, count, data address
Channel            Channel status
I/O device         Device status


10.2.7 Program Status Word 


Only parts of the PSW are involved in the channel operation. They are the channel 
masks in bits 0 to 2, the interrupt code in bits 16 to 31, and the condition code in 
bits 34 to 35. The channel masks make it possible to prevent all I/O interrupts from
occurring; more commonly, they allow I/O interrupts from one channel while preventing
I/O interrupts from another channel.


TABLE 10.3 Designation of the PSW Condition Code for I/O Operations


CONDITION
CODE          DESIGNATION

Start I/O
  00          I/O operation successfully initiated
  01          Status part of the CSW is stored
  10          Channel or subchannel is busy
  11          Unit not operational
Halt I/O
  00          The channel or subchannel was not working
  01          Exceptional condition, status portion of CSW is stored
  10          Operation terminated
  11          Unit not operational
Test I/O
  00          Channel, unit, and device available
  01          CSW stored†
  10          Channel or subchannel is busy
  11          Unit not operational
Test Channel
  00          Channel available
  01          Interruption pending in channel
  10          Channel or subchannel is busy
  11          Unit not operational

†Interrupt cleared or exceptional condition during test I/O detected.




When the CPU accepts an I/O interrupt, the interrupt code gives the channel- 
unit address which caused the interrupt. The interrupt code is stored in the old PSW. 
The condition code is set according to the values in stats Y2 and Y3 during the 
execution of an I/O instruction. The designation of the condition code for the four 
I/O instructions is shown in Table 10.3.


10.2.8 I/O Instructions


There are four I/O instructions used by the CPU to communicate with a channel: 
(a) start I/O, (b) test I/O, (c) halt I/O, and (d) test channel. These instructions are
in the SI format as shown previously in Fig. 9.4. Bits 8-15 are ignored. The B field
(bits 16-19) designates a register. The sum of the contents of the B register and the
D field (bits 20-31) identifies the channel, the subchannel, and the I/O device. Bits
21-31 of this sum constitute the 11-bit I/O address with bits 0-20 ignored. Of the 
11-bit I/O address, bits 21-23 give a channel address while bits 24-31 give a device 
address and identify the device on the channel and, in the case of the multiplexor 
channel, the subchannel. 
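The address arithmetic just described can be sketched in Python as follows. The function name io_address is hypothetical, and the sum is assumed to be a 32-bit quantity whose bit 0 is the most significant bit, so that bits 21-31 are the low-order 11 bits.

```python
def io_address(base_register_contents, displacement):
    """Return (channel address, device address) from the B register and D field."""
    total = (base_register_contents + displacement) & 0xFFFFFFFF
    address = total & 0x7FF      # bits 21-31 of the sum; bits 0-20 are ignored
    channel = address >> 8       # bits 21-23
    device = address & 0xFF      # bits 24-31
    return channel, device
```

For example, a zero base register and a D field of 190 hex. would give channel 1 and device 90 hex.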

The start I/O instruction initiates a write, read, read-backward, control, or sense 
operation at the addressed I/O device, the channel, or the subchannel. The CAW at 
main storage location 72 contains the protection key for the channel or subchannel 
and the address of the first CCW. The CCW specifies the operation, the area in the 
main storage to be used, and the action to be taken when the operation is completed. 
Execution of this instruction also sets the condition code in the PSW to indicate the 
status at the termination of the operation. 

The halt I/O instruction terminates the current I/O operation at the addressed
subchannel or channel and sets the PSW condition code to indicate the cause. The
test I/O instruction sets the PSW condition code to indicate the state of the addressed
channel, subchannel, and device, and under certain conditions stores the CSW.
The test channel instruction sets the PSW condition code to indicate the state of the
addressed channel; the state of the addressed channel is not affected. The designation
of the PSW condition code for these four I/O instructions is shown in Table 10.3.


10.2.9 I/O Interrupts


I/O interrupts provide a means for the CPU to change its state in response to 
the interrupt conditions occurring in I/O devices or channels. The interrupt condition 
can be initiated by a device or by a channel. The conditions that can be initiated by 
the device and their types are: 


1. Channel-end (end type)
2. Unit check (end type)
3. Unit exception (end type)
4. Device-end (device end type)
5. Control-unit-end (device end type)
6. Attention (device end type)


The conditions that can be initiated by the channel are:


1. Program check (end type) 
2. Program-controlled-interrupt or PCI (end type) 


Every I/O operation initiated by the start I/O instruction generates the channel-end 
condition and the device-end condition. The channel-end condition is generated when- 
ever the control unit no longer needs the interface facilities to complete a command. 
The device-end condition is generated when the device currently operating reaches 
its normal mechanical ending point. The control-unit-end condition is generated to
indicate that the control unit is now available. This condition should come from the
control unit that was previously addressed by a start I/O or a test I/O instruction
and was busy at that time. The attention condition is generated by such devices as
console typewriters and display consoles when a manual inquiry is made. The
program-controlled-interrupt condition (PCI) is generated when a PCI flag is present in a
CCW. Unless the channel-end condition occurs before the PCI interrupt is taken,
the PCI flag is propagated through the following CCW’s until the interrupt is taken.

As shown above, there are two types of interrupt conditions. In an end type 
interrupt, a complete CSW is stored in the main storage, and the unit status is stored 
in the UCW in the subchannel. Thus, the information necessary to form the CSW 
is found in the subchannel, and the unit does not have to be selected to obtain the 
status. In a device end type interrupt, the status of the device is stacked at the device; 
the device is selected so that this status is obtained and then stored in the CSW. In 
short, an end-type interrupt has the status available in the channel, while a device end 
type interrupt has the status stacked in the device. 

As an illustration, Fig. 10.10 shows a greatly simplified flowchart of the multiplexor
channel interrupt micro-subroutine or microprogram (mpgm). The PRI in
the figure represents a program interrupt condition that is set by external interrupts,
I/O interrupts, and update timer interrupts. As shown there, the PRI condition is
tested at the beginning of every instruction fetch. When the PRI condition is 1, the
instruction counter is stored, and the type of interrupt is determined. If it is an external
interrupt or an update timer interrupt, control is transferred to the external interrupt
micro-subroutine or the update-timer interrupt micro-subroutine, respectively. If it is
an I/O interrupt, the particular channel is determined. If the interrupt is for a multiplexor
channel, it is then determined whether the interrupt is a device-end type or
an end-type. If it is an end-type (bit 5 of byte 0 of the interrupt buffer being 0 as will
be shown), the complete CSW is stored. If the interrupt is due to the end type but
not PCI, the PCI flags and op-codes in UCW are set to 0. If it is due to the end type
and PCI, only the PCI flags are set to 0. If it is a device-end type, the device is selected
by using the test I/O command and its status is accepted to be used later as a part
of the CSW. The stacked status is cleared from the device. The CSW which contains
only unit status and channel status is stored in the main storage. For both types,
this interrupt microprogram leads to the store-PSW microprogram.


Fig. 10.10 Flowchart showing multiplexor channel interrupt
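The decisions taken for a multiplexor channel interrupt can be restated as a small Python sketch. It is merely a paraphrase of the description above, not the microprogram itself; the function name and the step strings are hypothetical.

```python
def multiplexor_interrupt_steps(device_end_type, pci=False):
    """List the actions taken for one multiplexor channel I/O interrupt."""
    steps = []
    if device_end_type:
        # Status is stacked at the device and must be fetched from it.
        steps.append("select device with test I/O")
        steps.append("accept stacked status")
        steps.append("clear stacked status at device")
        steps.append("store CSW with unit and channel status only")
    else:
        # End type: the status is already held in the subchannel.
        steps.append("store complete CSW")
        if pci:
            steps.append("zero PCI flags in UCW")
        else:
            steps.append("zero PCI flags and op-codes in UCW")
    steps.append("go to store-PSW microprogram")
    return steps
```

The pci argument matters only for the end type, since PCI is listed above as an end-type condition.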


10.3 I/O Interface


The I/O interface refers to those lines which serve as a communications link 
between the channel and the I/O control units (and in turn the I/O devices that are
attached to each I/O control unit). Different I/O devices such as keyboards, card
readers, printers, drums, discs, and terminals require different control units. The 
control unit decodes the commands received from the channel, interprets them for 
the I/O device, and provides the signal sequence for executing the operation. Because 
the I/O control units are different, the interface between the I/O control unit and 
its I/O devices has to be different. However, it has become a regular practice for a 
family of computers to have an identical interface between the channel and its attached 
I/O control units. Such an interface, called the standard I/O interface, provides a 
uniform method of attaching I/O control units to the channel and allows greater 
ease and economy in connecting the computer system to various I/O devices. 

In this section, the standard I/O interface for the IBM System/360 is described. 
This interface uses formats and signal sequences that are common to all the attached control
units. Except for signals used to establish the selection control, all communications
to and from the channel occur over a “common bus.” At any one time, only one 
control unit can be logically connected to the channel. The selection of the control 
unit to communicate with the channel is controlled by the signal on a line passing 
through all control units. This signal, called the select-out, permits sequential response 
of each control unit to the signals provided by the channel. A control unit remains 
logically connected to the interface until it transfers the information it needs or has, 
or until the channel signals it to disconnect. The priority of the I/O control units 
attached to the channel depends on the order of proximity of the I/O control units 
to the channel. 


10.3.1 I/O Interface in a Multisystem


As defined by Blaauw [7], a multisystem is one that has two or more CPU’s 
capable of communicating with each other without manual intervention. The I/O 
interfaces of the multisystem shown in Fig. 10.11 are now used as an example. In this 
multisystem, there are four standard I/O interfaces A, B, C, and D connecting the 
two CPU’s to a multitude of I/O devices. Interface A connects channel A to the 
multidevice control units X, Y, Z, and the shared multidevice control unit T. Interface 
D connects channel D to the multidevice control unit R, the integrated control units 
(i.e., the control unit is an integral part of two I/O devices), and the shared multidevice 
control unit T. The connection of the shared multidevice control unit by interfaces 
A and D allows the access of the two attached I/O devices by CPU 1 via channel A
or by CPU 2 via channel D. Interface B connects channel B to the channel-to-channel 
adapter and the multidevice control unit W. Similarly, interface C connects channel 
C to the channel-to-channel adapter and the multidevice control unit V. A channel- 
to-channel adapter permits connection of the I/O interfaces of two channels and makes 
each channel appear as a control unit to the other channel; the two CPU’s are thus 
interconnected. Furthermore, four I/O devices are connected to a shared switch which 
in turn is attached to the multidevice control units R, V, W, and X; this organization 
allows the four I/O devices to be shared by the four control units and in turn allows 
them to be accessible by the two CPU’s via any of the four interfaces. 


CPU  = Central processing unit        CH  = Channel
SDCU = Single device control unit     D   = I/O device
MDCU = Multidevice control unit       ICU = Integrated control unit


Fig. 10.11 I/O interfaces of a multisystem


10.3.2 Interface Lines 


The standard I/O interface consists of 34 lines. They can be divided into seven 
groups: input bus, output bus, inbound tags, outbound tags, scan controls, interlocks,
and special controls. These lines are shown in Fig. 10.12. They are declared by
the following terminal statements.


Comment, declaration of the 34 interface lines                          (10.2)

Comment, two buses

Terminal, BUS-IN(P,0-7),     $8 data lines and 1 parity line
          BUS-OUT(P,0-7)     $8 data lines and 1 parity line

Comment, three inbound and three outbound tag lines

Terminal, ADR-IN,            $address-in line
          STA-IN,            $status-in line
          SRV-IN,            $service-in line
          ADR-OUT,           $address-out line
          CMD-OUT,           $command-out line
          SRV-OUT            $service-out line

Comment, four scan control lines

Terminal, SEL-IN,            $select-in line
          REQ-IN,            $request-in line
          SEL-OUT,           $select-out line
          HLD-OUT,           $hold-out line

Comment, two interlock lines

Terminal, OP-IN,             $operation-in line
          OP-OUT,            $operation-out line

Comment, four special control lines

Terminal, MTR-IN,            $metering-in line
          MTR-OUT,           $metering-out line
          SUP-OUT,           $suppress-out line
          CLK-OUT,           $clock-out line


Lines between the channel and the I/O control unit:

Bus-out (9 lines)             Bus-out bit positions P, 0, 1, 2, 3, 4, 5, 6, 7
Bus-in (9 lines)              Bus-in bit positions P, 0, 1, 2, 3, 4, 5, 6, 7
Inbound tags (3 lines)        Address-in, Status-in, Service-in
Outbound tags (3 lines)       Address-out, Command-out, Service-out
Scan controls (4 lines)       Select-out, Hold-out, Select-in, Request-in
Interlocks (2 lines)          Operational-out, Operational-in
Special controls (4 lines)    Suppress-out, Clock-out, Metering-out, Metering-in


Fig. 10.12 The 34 lines of the standard I/O interface for the IBM System/360


The above BUS-OUT lines are used to transmit data, addresses, commands, or 
control orders from the channel to the control unit; the BUS-IN lines to transmit 
data, device identification, status information, and sense data from the control unit 
to the channel. The tag lines are used for interlocking and indicating information on 
the buses and for special sequences. The scan control is used for the scanning or 
selection of attached I/O devices. The special control lines are used for the usage 
meters located in the various attached units. 


10.3.3 Sequence Controls 


The interface lines connect all the control units attached to the channel. Signaling
on one or more of these lines forms a sequence that conveys control information. A number
of such sequence controls, shown in Fig. 10.13, are explained below.

The selection of a control unit is achieved by lines SEL-OUT and SEL-IN.
They form a loop by having line SEL-OUT connect the channel to the control units
one after another in sequence, and then having line SEL-IN provide a return path
from the last control unit to the channel. When the channel raises line SEL-OUT,
the signal on line SEL-OUT interrogates the control units one after another according
to their proximity to the channel; this is called polling. The presence of a signal on line
SEL-IN after raising line SEL-OUT notifies the channel that no control unit on the
interface has requested service, as shown in Fig. 10.13(a).
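The polling loop lends itself to a very small Python sketch. The function poll and its argument are hypothetical; each control unit is reduced to a single truth value stating whether it wants service, and the list is assumed to be ordered by proximity to the channel.

```python
def poll(units_requesting_service):
    """Propagate select-out down the chain of control units.

    Returns the position of the control unit that captures the signal,
    or None when the signal comes back to the channel on select-in.
    """
    for position, wants_service in enumerate(units_requesting_service):
        if wants_service:
            return position      # this unit does not pass select-out on
    return None                  # select-in raised: no unit requested service
```

With poll([False, True, True]) the unit in position 1 is served first, which reflects the priority by proximity mentioned in Sec. 10.3.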

Line ADR-OUT is used to signal all control units to decode the I/O device
address on line BUS-OUT; the control unit, when recognizing the address, must
respond by raising line OP-IN to indicate that the device is selected, as shown in
Fig. 10.13(b).


Fig. 10.13 Interface sequence controls: (a) polling, (b) device selection,
(c) service request, (d) accept address, (e) accept status, (f) accept data,
(g) data ready, (h) command ready, (i) stop, (j) proceed, (k) stack status,
(l) suppress status, (m) suppress data, (n) chain command, (o) selective reset,
(p) system reset, (q) interface disconnect (ch = channel, cu = control unit)

A control unit raises line REQ-IN to indicate to the channel that it needs service.
The channel responds by raising line SEL-OUT, as shown in Fig. 10.13(c). 

Line ADR-IN is used by the control unit to signal to the channel when the 
address of the currently selected I/O device has been placed on line BUS-IN; the 
channel responds by line CMD-OUT to indicate that the address is accepted, as 
shown in Fig. 10.13(d). 

Line STA-IN is used by the control unit to signal the channel when the selected
control unit has placed status information on line BUS-IN; the channel responds by 
line SRV-OUT to indicate that the channel has accepted the status information, as 
shown in Fig. 10.13(e). 

Line SRV-IN is used by the control unit to signal to the channel when the 
selected I/O device is ready to transmit or receive a byte of information. If the channel 
signals line SRV-OUT in response to line SRV-IN during a read, a read-backward, 
or a sense operation, it indicates that the information placed on line BUS-IN has 
been accepted by the channel, as shown in Fig. 10.13(f). If this happens during a 
write or a control operation, it indicates that the requested information placed on 
line BUS-OUT is ready for acceptance by the control unit, as shown in Fig. 10.13(g). 

The basic function of line CMD-OUT is to indicate to a control unit that the 
information on line BUS-OUT is a command, as shown in Fig. 10.13(h). If line 
CMD-OUT is sent in response to line SRV-IN, it indicates stop (i.e., the channel 
is ending the current operation), as shown in Fig. 10.13(i). If line CMD-OUT is sent 
in response to line ADR-IN, it indicates to the control unit to proceed, as shown 
in Fig. 10.13(j). If CMD-OUT is sent in response to STA-IN, the channel requests 
the control unit to stack status at the control unit or I/O device, as shown in Fig. 
10.13(k). 
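Since the meaning of an outbound tag depends on the inbound tag it answers, the responses described so far can be collected in a lookup table. The Python sketch below is only a summary aid; the names RESPONSES and response_meaning and the wording of the returned strings are hypothetical.

```python
INPUT_OPERATIONS = {"read", "read-backward", "sense"}
OUTPUT_OPERATIONS = {"write", "control"}

# (inbound tag raised by the control unit, outbound tag raised by the channel)
RESPONSES = {
    ("ADR-IN", "CMD-OUT"): "address accepted; proceed",
    ("STA-IN", "SRV-OUT"): "status accepted",
    ("STA-IN", "CMD-OUT"): "stack status",
    ("SRV-IN", "CMD-OUT"): "stop",
}


def response_meaning(inbound, outbound, operation=None):
    """Interpret the channel's outbound tag as an answer to an inbound tag."""
    if (inbound, outbound) == ("SRV-IN", "SRV-OUT"):
        if operation in INPUT_OPERATIONS:
            return "data on BUS-IN accepted"
        if operation in OUTPUT_OPERATIONS:
            return "data on BUS-OUT ready"
        raise ValueError("operation must be given for SRV-IN/SRV-OUT")
    return RESPONSES[(inbound, outbound)]
```

For instance, response_meaning("SRV-IN", "SRV-OUT", "write") gives the data-ready case of Fig. 10.13(g).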

The functions of SUP-OUT are: selective reset, suppress status, suppress data
transfer, and chain command. Selective reset is shown in Fig. 10.13(o). When
SEL-OUT rises at a control unit holding stacked status data, that control unit
will not capture the interface to present the status information if SUP-OUT is
active; thus, status information is suppressed as shown in Fig. 10.13(l). When a
channel raises SUP-OUT before the fall of SEL-OUT, OP-IN, and REQ-IN
is blocked from a control unit, the data transfer is suppressed as shown in Fig.
10.13(m). When the channel, in response to STA-IN from a control unit, signals
SUP-OUT and SRV-OUT, the selected control unit and I/O device are held, and
the next command from the channel is directed to that control unit and I/O device,
as shown in Fig. 10.13(n).

OP-OUT enables all control units on the interface to communicate with the
channel. If OP-OUT is down and SUP-OUT is up while a control unit is operating
on the interface, the control unit is reset, as shown in Fig. 10.13(o). If both OP-OUT
and SUP-OUT are down, all control units on the interface are reset, as shown in
Fig. 10.13(p).


When ADR-OUT is up and SEL-OUT is down before the completion of any 
signal sequence, the control unit will recognize the signal interface-disconnect and 
respond by dropping line OP-IN, while the channel will in turn respond by dropping 
line ADR-OUT, as shown in Fig. 10.13(q). 

The channel signals line HLD-OUT to all control units in parallel to allow the 
channel to cancel the effect of line SEL-OUT at the chosen control unit. As a result, 
line SEL-OUT produces no effect on the control unit. CLK-OUT indicates to the 
control unit that the CPU is not in a halt or wait condition. Line MTR-OUT enables 
the usage meter at the control unit to record time. Line MTR-IN is used to indicate 
to the channel that the control unit is recording time. 


10.3.4 Address, Command, Status, and Sense Bytes 


The byte of information on lines bus-out can be an address or a command, 
while that on lines bus-in can be an address, status information, or sense information. 
Therefore, they are to be referred to as address byte, command byte, status byte, 
and sense byte. 

When a control unit is installed, a unique 8-bit I/O device address is assigned to 
each I/O device attached to the control unit. This address byte is used over the inter- 
face for direct addressing of the attached I/O devices. Furthermore, the control unit 
is assigned a set of address bytes. It must respond to those addresses in the set which 
are ready, or which are not ready but can be made ready by means of an ordinary 
manual intervention. 

The I/O operation to be executed over the interface is determined by the command
byte issued by the channel to the I/O device during a channel-initiated selection
sequence. The designation of the command byte is shown in Table 10.4. As shown
there, the low-order bit positions specify the type of operations. The higher-order
bit positions are modifier bits for use by the control unit and the I/O device. Thus,
the exact values of the modifier bits depend on the particular I/O device. Note that,
in Table 10.4, commands basic sense, basic read, and no-op control are commands
sense, read, and control, respectively, with their modifier bits being 0.


TABLE 10.4 Designation of the Command Byte on the Interface†


                     BIT POSITION
COMMAND          P  0  1  2  3  4  5  6  7
Test I/O         1  0  0  0  0  0  0  0  0
Sense            P  M  M  M  M  0  1  0  0
Read-backward    P  M  M  M  M  1  1  0  0
Write            P  M  M  M  M  M  M  0  1
Read             P  M  M  M  M  M  M  1  0
Control          P  M  M  M  M  M  M  1  1
Basic sense      0  0  0  0  0  0  1  0  0
Basic read       0  0  0  0  0  0  0  1  0
No-op control    1  0  0  0  0  0  0  1  1

†M = modifier bit, P = parity bit
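The decoding implied by Table 10.4 can be sketched in Python as shown below. The function names are hypothetical, the command byte is taken as an 8-bit integer with interface bit 0 as the most significant bit, and the parity bit P is assumed to give odd parity, which is inferred from the fully specified rows of the table rather than stated in the text.

```python
def classify_command(byte):
    """Name the operation selected by the low-order bits of a command byte."""
    if byte == 0b00000000:
        return "test I/O"
    low_two = byte & 0b11
    if low_two == 0b01:
        return "write"
    if low_two == 0b10:
        return "read"
    if low_two == 0b11:
        return "control"
    low_four = byte & 0b1111
    if low_four == 0b0100:
        return "sense"
    if low_four == 0b1100:
        return "read-backward"
    return "not listed in Table 10.4"


def odd_parity_bit(byte):
    """Parity bit that makes the total number of 1 bits odd (assumed convention)."""
    return 1 - bin(byte).count("1") % 2
```

Under this reading the basic sense, basic read, and no-op control bytes are 04, 02, and 03 hex., respectively.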

The read command initiates the execution of data transfer from the control unit 
to the channel and the data is obtained, for example, from the record source of the 
particular I/O device in operation. The read-backward command initiates an opera- 
tion in the same manner as the read command, except that the data bytes are trans- 
ferred to main storage by the channel in the reverse order to that for read command. 
The write command initiates the data transfer from the channel to the control unit 
which decodes the modifier bits and performs the required operations. The sense 
command proceeds exactly as a read command, except that the data is a sense byte. 
The test I/O command receives the status byte from the addressed I/O device. A basic 
read command is a read command with zero modifier bits; it is also used as a read 
command in the initial program loading (IPL). The basic sense command initiates 
a sense operation on all I/O devices. The no-op control command performs no operation
at the I/O device, except to satisfy any previously indicated chaining operations
and to allow certain I/O devices to wait for conditions of checking before releasing 
the channel. 


TABLE 10.5 Designation of the Status Byte 


BIT POSITION DESIGNATION
P Parity 
0 Attention 
1 Status modifier 
2 Control unit end 
3 Busy 
4 Channel end 
5 Device end 
6 Unit check 
7 Unit exception 


The designation of the status byte is shown in Table 10.5. The status byte is 
transmitted to the channel in the following six situations: 


1. During the initial-selection sequence
2. To present the channel-end status at the termination of data transfer
3. To present the device-end signal and any associated conditions to the channel
4. To present control-unit-end or device-end status which signals that the control unit
   or device is now free
5. To present any previously stacked status when allowed to do so
6. To present any externally initiated status to the channel


Data transfer during a sense operation provides information concerning unusual
conditions detected in the last operation and concerning the actual state of the I/O
device. All sense information is normally provided in the first two bytes. Any bit 
positions that follow those used for programming information contain diagnostic 
information, which may extend to as many bytes as needed. For most I/O devices, 
the first six bits of the first sense byte are common to all I/O devices; their designation 
is shown in Table 10.6.


TABLE 10.6 Designation of the First Sense Byte 


BIT POSITION DESIGNATION
0 Command reject 
1 Intervention required 
2 Bus-out check 
3 Equipment check 
4 Data check 
5 Overrun 


10.3.5 Interface Sequences 


The communication on the I/O interface is carried out by various signal sequences 
through the interface. A byte-mode operation on one subchannel of the multiplexor 
channel is now chosen as an example. The flowchart of the byte-mode operation is 
shown in Fig. 10.14. As shown there, the multiplexor channel starts by performing 
an initial selection sequence to select the I/O device. After the device is selected, the 
data is transferred by activating a data transmission sequence. When the data transfer 
is completed, an ending sequence is carried out. If there is more than one I/O
operation, the multiplexor channel performs these operations in turn, according to
the priority of the active control units. In order to illustrate the operations of interface 
sequences, the initial selection sequence, the byte-mode data transmission sequence, 
and the ending sequence are now exemplified. 

The initial selection sequence enables the channel to select an I/O device and to 
specify a device operation. It begins when the channel transmits an address byte to 
the interface. Each control unit attached to the interface attempts to decode the 
address, but only one control unit can interpret the address byte. The selected control 
unit responds by either raising line STA-IN to indicate that the unit is busy or raising 
line OP-IN to indicate that the device will complete the initial selection sequence. 
If no control unit decodes the address byte, the control unit with the lowest priority 
propagates the signal on line SEL-IN back to the channel; this signal causes the 
channel to drop line ADR-OUT and terminate the initial selection sequence. When 
the channel receives the signal from line OP-IN, it responds by dropping ADR-OUT. 
The selected control unit then transmits the address byte to the channel. The channel 
now compares this address to the one placed earlier on line BUS-OUT. If the 
addresses agree, the channel transmits a command byte to the control unit, which 
responds by transmitting a status byte. If the I/O device is available, the status byte
contains zero. If the channel accepts this status byte, it terminates the initial selection
sequence by raising line SRV-OUT.

[Fig. 10.14 A byte-mode operation on one subchannel of the multiplexor channel. Flowchart: Entry, initial selection sequence, data transmission sequence, ending sequence, Exit.]

Assume that register ISS issues the control signal for starting the initial selection 
sequence and that the selection is successful. The signal sequence on the interface is 
described below. 


Comment, description of the initial selection sequence                (10.3)
/ISS/        ADR-OUT ← 1, BUS-OUT ← "out-address byte",
/ΔISS/       SEL-OUT ← 1,                         $Δ is a delay
Comment, the selected control unit responds
/SEL-OUT/    OP-IN ← 1,
/OP-IN/      ADR-OUT ← 0,
Comment, the selected control unit transmits the address byte
/ADR-OUT'/   BUS-IN ← "in-address byte", ADR-IN ← 1,
Comment, the channel compares addresses and, if they agree, transmits command byte
/ADR-IN/     IF (in-address=out-address) THEN
             (BUS-OUT ← "command byte", CMD-OUT ← 1, SEL-OUT ← 0),
Comment, the control unit drops address in signal
/CMD-OUT/    ADR-IN ← 0,
Comment, the channel drops command out signal
/ADR-IN'/    CMD-OUT ← 0,
Comment, the control unit transmits the status byte
/CMD-OUT'/   BUS-IN ← "status byte", STA-IN ← 1,
Comment, the channel and control unit terminate the initial selection sequence
/STA-IN/     SRV-OUT ← 1,
/SRV-OUT/    STA-IN ← 0, OP-IN ← 0,
/STA-IN'/    SRV-OUT ← 0,
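
The handshake in statements (10.3) can be traced with a small Python sketch. This is an illustrative model only, not the book's CDL simulator: the names INITIAL_SELECTION and run_sequence are hypothetical, the bus transfers are omitted, the address comparison is assumed to succeed, and the delayed ISS label is treated simply as a second step labeled by ISS. Each step checks that its label has the required value before its micro-operations are applied.

```python
# Each entry: (label line, value required of the label, {line: new value}).
# Transcribed from statements (10.3); a primed label such as ADR-OUT'
# is written as a required value of 0.
INITIAL_SELECTION = [
    ("ISS", 1, {"ADR-OUT": 1}),
    ("ISS", 1, {"SEL-OUT": 1}),                # delayed ISS
    ("SEL-OUT", 1, {"OP-IN": 1}),
    ("OP-IN", 1, {"ADR-OUT": 0}),
    ("ADR-OUT", 0, {"ADR-IN": 1}),
    ("ADR-IN", 1, {"CMD-OUT": 1, "SEL-OUT": 0}),
    ("CMD-OUT", 1, {"ADR-IN": 0}),
    ("ADR-IN", 0, {"CMD-OUT": 0}),
    ("CMD-OUT", 0, {"STA-IN": 1}),
    ("STA-IN", 1, {"SRV-OUT": 1}),
    ("SRV-OUT", 1, {"STA-IN": 0, "OP-IN": 0}),
    ("STA-IN", 0, {"SRV-OUT": 0}),
]

def run_sequence(steps, start):
    """Apply the steps in order, checking each label before its step runs."""
    lines = dict(start)
    trace = []
    for label, required, updates in steps:
        if lines.get(label, 0) != required:
            raise RuntimeError(f"label {label} is not {required}")
        lines.update(updates)
        trace.append(dict(lines))      # snapshot of all lines after the step
    return lines, trace

final, trace = run_sequence(INITIAL_SELECTION, {"ISS": 1})
```

At the end of the trace every interface line has returned to 0, which is consistent with the sequence leaving the interface free for the data transmission sequence.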


The byte-mode data transmission sequence may occur as follows. The control
unit raises line REQ-IN to request service. When the signal on line SEL-OUT from the
channel reaches the control unit, the control unit transfers an address byte. The channel
accepts the address and orders the control unit to proceed; meanwhile, the channel
initiates and executes the dump routine. The control unit now transmits a data
byte; the channel accepts the data and initiates and executes the undump routine.
The control unit now terminates the single byte transmission.


Assume that DUMP and UNDUMP are the control signals to activate dump
and undump routines, respectively, and that register DTS initiates the data transmission
sequence. The signal sequence on the interface for a data transmission sequence
is described below.


Comment, a byte-mode data transmission sequence on the multiplexor channel   (10.4)
Comment, the control unit requests service from channel
/DTS/        REQ-IN ← 1,
Comment, the channel scans the interface
/REQ-IN/     SEL-OUT ← 1,
Comment, the control unit transmits an address byte
/SEL-OUT/    BUS-IN ← "address byte", ADR-IN ← 1, OP-IN ← 1, REQ-IN ← 0,
Comment, the channel accepts and orders to proceed
/ADR-IN/     CMD-OUT ← 1, SEL-OUT ← 0, DUMP ← 1,
Comment, the control unit drops address in signal
/CMD-OUT/    ADR-IN ← 0,
Comment, the channel drops command out signal
/ADR-IN'/    CMD-OUT ← 0,
Comment, the control unit transfers a data byte
/CMD-OUT'/   BUS-IN ← "data byte", SRV-IN ← 1,
Comment, the channel acknowledges and performs undump
/SRV-IN/     SRV-OUT ← 1, UNDUMP ← 1, memory ← "data",
Comment, the channel and control unit terminate the transmission
/SRV-OUT/    SRV-IN ← 0, OP-IN ← 0,
/SRV-IN'/    SRV-OUT ← 0,


When any I/O operation (except test I/O and no-op) has proceeded to its normal
end, the control unit transmits another status byte to the channel to ask for termination.
The channel signals line SRV-OUT to indicate its acceptance of the status
byte and resets the operation. (The channel may signal line CMD-OUT to cause the
control unit to stack the status byte.) The signal sequence on the interface for an
ending sequence, initiated by register END, is described below.


Comment, description of the ending sequence                           (10.5)
Comment, the control unit requests service from channel
/END/        REQ-IN ← 1,
Comment, the channel scans the interface
/REQ-IN/     SEL-OUT ← 1,
Comment, the control unit transmits an address byte
/SEL-OUT/    REQ-IN ← 0, OP-IN ← 1, ADR-IN ← 1, BUS-IN ← "address byte",
Comment, the channel accepts and orders to proceed
/ADR-IN/     CMD-OUT ← 1, SEL-OUT ← 0,
Comment, the control unit drops address in signal
/CMD-OUT/    ADR-IN ← 0,
Comment, the channel drops command out signal
/ADR-IN'/    CMD-OUT ← 0,
Comment, the control unit transmits the status byte
/CMD-OUT'/   BUS-IN ← "status byte", STA-IN ← 1,
Comment, the channel acknowledges the status byte
/STA-IN/     SRV-OUT ← 1,
Comment, the channel and control unit terminate the transmission
/SRV-OUT/    STA-IN ← 0, OP-IN ← 0,
/STA-IN'/    SRV-OUT ← 0,




10.4 Selector Channel 


A selector channel allows model 40 to use high-speed I/O devices. The high 
data rate is obtained by using separate registers, instead of the CPU registers, to 
perform channel functions, and by using sequential logic control to replace the lengthy 
microprogram routines. As many as two selector channels can be attached to model 
40. 


10.4.1 Unit Control Word


The UCW is stored in local storage. There is only one UCW for each selector 
channel. The UCW for SCH 1 occupies locations 32-37, while the UCW for SCH 
2 occupies locations 48-53. The format for the SCH UCW is shown in Table 10.7. 


TABLE 10.7 The UCW Format for Selector Channel


LOCATION DESCRIPTION
UCW0(0-15) Dump area for register A
UCW1(0-15) Dump area for register D
UCW2(0-15) Next CCW address
UCW3(0-15) Refill CCW address on write
UCW4(0-15) Interrupt buffer
UCW5(0-15) Working space


There are six halfwords. UCW0 and UCW1 store the contents of registers A and D,
respectively. UCW2 stores the next CCW address, which is the current CCW address
from the CAW incremented by eight. UCW3 stores the refill address on write for use
during data chaining. UCW4 stores the interrupt buffer whose format is shown in
Table 10.8. The first byte of the buffer contains zeros or flags; the second byte contains
the unit number of the device currently working on the channel. UCW5 is a
general working area.
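
The UCW layout just described amounts to simple address arithmetic, sketched below in Python. The names ucw_location and next_ccw_address are hypothetical, and the sketch assumes that each of the six halfwords occupies one local storage location, as the ranges 32-37 and 48-53 suggest.

```python
# First local-storage location of the UCW for each selector channel.
UCW_BASE = {1: 32, 2: 48}
UCW_HALFWORDS = 6          # UCW0 through UCW5

def ucw_location(channel, index):
    """Local-storage location of halfword UCW<index> for SCH <channel>."""
    if channel not in UCW_BASE:
        raise ValueError("model 40 has at most two selector channels")
    if not 0 <= index < UCW_HALFWORDS:
        raise ValueError("a selector channel UCW has six halfwords")
    return UCW_BASE[channel] + index

def next_ccw_address(current_ccw_address):
    """Value kept in UCW2: the current CCW address incremented by eight."""
    return current_ccw_address + 8
```

For example, under these assumptions the interrupt buffer UCW4 of SCH 2 lies at location 52.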


TABLE 10.8 The Interrupt Buffer Format for Selector Channel


BITS DESIGNATION
0-3 Not used
4 Channel control check on logout
5 Attention on device-end
6 End
7 PCI
8-15 Unit number




10.4.2 Data Flow 


The selector channel uses separate registers to hold data and control information,
as shown in the configuration in Fig. 10.15. Register S holds the main storage data
address obtained from the CCW; this eliminates the need and thus the time to dump
the A register. The S register has two bytes in addition to three extension bits. Register
T holds the byte count during data transfer operations and thus allows rapid determination
of byte count. The T register also has two bytes. A group of five one-byte
registers makes up the byte buffer to provide buffering between the main storage
and the interface lines. These registers remove the need for local storage access during
data service. Channel-flag register CF and checks-and-status register CS hold the
data for controlling the channel. They allow rapid control actions and again eliminate
local storage access. Register KEY, channel storage-protection key register, is used
to hold the key in the CAW. Register ROSCAR, address register of the read-only
storage for selector channel, is provided for rapid access of the read-only storage.

[Fig. 10.15 Selector channel configuration for the IBM/360 model 40. Labels legible in the figure: BUS-OUT, BUS-IN, BUSP, BUSALU, ROSCAR, selector channel interface.]

Main-storage buffer register D, however, must now perform the same function, 
because sufficient time exists during the main storage operations to dump the D 
register contents. Thus, the D register must be dumped during memory access for 
data service. 


The above mentioned registers and buses for the selector channel are now
described by the following statements:


Comment, configuration of a selector channel                          (10.6)
Register,     W0(P,0-7),        $byte buffer
              W1(P,0-7),        $byte buffer
              W2(P,0-7),        $byte buffer
              W3(P,0-7),        $byte buffer
              W4(P,0-7),        $byte buffer
              T0(P,0-7),
              T1(P,0-7),
Casregister,  T=T0-T1,          $count register
Register,     SX(P,5-7),        $3 extension address bits
              S0(P,0-7),
              S1(P,0-7),
Casregister,  S=SX-S0-S1,       $MS data address
Register,     CF(0-7),          $channel-flags register
              CF(CDA)=CF(0)     $chain data address flag
              CF(CC)=CF(1)      $chain command flag
              CF(SLI)=CF(2)     $suppress length indication flag
              CF(SKIP)=CF(3)    $skip flag
              CF(YCH1)=CF(4)
              CF(YCH3)=CF(5)
              CF(RW)=CF(6)
              CF(RBD)=CF(7)
Register,     CS(0-7),          $channel checks and status register
              CS(PCI)=CS(0)     $program controlled interrupt bit
              CS(WLR)=CS(1)     $wrong length record bit
              CS(PGC)=CS(2)     $program check bit
              CS(PTC)=CS(3)     $protection check bit
              CS(CDC)=CS(4)     $channel data check
              CS(CCC)=CS(5)     $channel control check
              CS(ICC)=CS(6)     $interface control check
              CS(CC)=CS(7)      $chain check
Register,     KEY(P,4-7),       $channel storage-protection key register
Register,     ROSCAR(12-0),     $read-only-storage address register for SC
Bus,          BUSP(0-7,P),      $ALU P bus
              BUSQ(0-7,P),      $ALU Q bus
              BUSALU(0-7,P),    $ALU output bus
              BUSR0(0-7,P),     $first byte of bus R
              BUSR1(0-7,P),     $second byte of bus R


The buffering of incoming and outgoing bytes by the byte buffers is accomplished
as follows. For a read operation, a byte, received from the I/O device, is held in buffer
W4 and then passed down to buffer W0. The next incoming byte is received and
passed down to buffer W1. These two bytes are then transferred from buffers W1 and
W0 to register D from which they are stored into main storage at the location specified
by the S register. For a write operation, two bytes are read out of main storage
and transferred into buffers W4 and W3; they are then passed down to buffers W0
and W1. These two bytes are then transmitted to the control unit one byte at a time.

The storage protect (SP) key, provided by the CAW, is held in the channel key 
register KEY. Whenever main storage is accessed to store a byte of data during 
a selector channel operation, the SP key is read out of storage protect local storage
(SPLS) to the SPLS data register. The contents of the two registers are compared; 
and if a mismatch occurs, an error indication is generated. 


10.4.3 Data Service Operation 


As an illustration of data service, a write operation by the selector channel is 
described. The flowchart of this write operation is shown in Fig. 10.16. As shown, 
the channel is initialized by the SIO instruction. The CAW is fetched and the storage
protect (SP) key is removed and inserted in the channel key register. The CCW is 
then fetched and the next CCW address is placed in the UCW in local storage since 
this address is not frequently used. The unit address from the start I/O instruction is 
placed in the byte buffer for transfer to the I/O control unit during initial selection. 

The contents of the CCW are now distributed to the registers, with the count 
to the T register, the data address to the S register, the command to byte buffer W 
for transfer to the control unit at the proper time, and the flags to the CF channel 
flags register for control of subsequent channel data service operation. 

This write operation begins with a proper starting address in address register
ROSCAR which then takes control away from address register ROAR for the read-only
storage and branches to the selector channel data service microprogram. This
microprogram dumps the D register, reads out the data addressed by the S register,
updates the S register and the count in the T register, and undumps the D register.
This is followed by the data transfer on the I/O interface. Channel operation now continues
with more I/O control unit service requests and channel responses until the
count is equal to 0, when the I/O device is instructed to stop. All of these operations
use standard interface sequences. The control of these data transfers is largely done
by sequential logic rather than a microprogram.

[Fig. 10.16 Flowchart showing a write operation by the selector channel]

When ending status arrives, the status byte is received and a proper address is 
placed into register ROSCAR to enter the ending status microprogram. A normal 
I/O interrupt is then generated. 

Since most of the operations described above are controlled by sequential logic 
rather than by microprograms, there is far less interference with the execution of the 
CPU program and, consequently, the data service to I/O devices is accomplished 
faster. 


10.4.4 Start I/O


The start I/O (SIO) instruction causes the channel to perform a read, write, 
read-backward, control, or sense operation at the device specified in the SIO instruc- 
tion. It performs the following functions. It selects the channel, the control unit, and 
the I/O device. It issues the command. It loads channel control information obtained 
from the CCW into the channel registers to enable the channel to perform the par- 
ticular operation on the specified device. It also sets the condition code in stats Y2 
and Y3. If the SIO is initiated successfully, the condition code is set to 0; the CPU 
then fetches the next instruction. The channel operation now continues in parallel 
with the CPU program, interrupting the CPU program only when access to main 
storage is required to enter or remove data bytes from the channel buffer. 

The start I/O operation is shown in the flowchart of Fig. 10.17. As shown, if 
the CPU is not in the supervisor state, the program check interrupt microprogram 
is entered and the SIO is suspended. If the CPU is in a supervisory state but the 
channel is not a valid selector channel, the condition code is set to three and the 
next instruction is fetched. If the channel is a valid selector channel, but is busy with 
a previously initiated operation or with a stacked interrupt, the condition code is 
set to two and the next instruction is fetched. If the valid selector channel is free, the 
CAW is fetched and the CCW will be obtained to load the control information (command
code, data address, count, and flags) into channel registers. The interface is
cleared for initial selection. When it occurs, the device number and busy code are
set in UCW4, the interrupt buffer in local storage, to signify that the channel now
becomes busy with an operation.

The CCW is read out of the main storage. The command code is stored in
UCW5 in the local storage and is tested for validity. If it is invalid, the program check
bit is set. If it is valid, it is tested to see whether it is a TIC command. If it is a TIC
command and if it comes from the first CCW, the program check bit is set because
it is invalid to specify a TIC command as the first CCW in a SIO. A TIC command
is valid only when it is entered from command chaining (i.e., not from the first CCW)
and then it must not specify another TIC command. If it is a valid command or a
proper TIC command, the data address is then stored in register A and in UCW3 in
the local storage; the count is stored in register T. If the count is 0, the program
check bit is set to 1. The flags are stored in flag register CF.


[Fig. 10.17(a), (b) Flowchart showing the execution of the SIO instruction. Legend: DE = device end, Com-C = command chaining.]


The program check bit is now tested. If it is set, it causes the interrupt-request
bit to be reset, the status part of the CSW to be loaded into main storage,
and the condition code to be set to 1. This setting of condition code 1 is used to inform
the CPU that the SIO failed; the reason for failure may be ascertained by inspecting
the CSW status.

With no program check, the channel issues commands and address to select 
the device and receives a status byte. If the control unit is busy, the status is loaded 
into the CSW, the condition code is set to 1, and the next instruction is fetched. If 
the control unit is not busy, the unit address is checked and a status byte is received. 
If the status byte is zero, the condition code is set to 0, and the next instruction is 
fetched. If the status contains channel-end signal with command chaining specified, 
the channel indicates to the device that command chaining will follow when the 
channel receives device-end signal. The condition code is set to 0, and the next instruc- 
tion is fetched. If the status contains both channel-end and device-end with command 
chaining specified, the command chaining microprogram is entered; the condition 
code is set to 0, and the next instruction is fetched. If the status is any other status, 
the status in the CSW is loaded into main storage; the condition code is set to 1; and 
the next instruction is fetched. 
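
The way the SIO outcome maps to a condition code can be summarized in a short Python sketch. It is only a reading of the description above, not the model 40 microprogram: the function name and parameters are hypothetical, the status bits follow Table 10.5 with bit 0 assumed to be the most significant bit, and a status byte is treated as a chaining case only when it is exactly channel end, or channel end together with device end.

```python
CHANNEL_END = 0x08   # bit 4 of the status byte (Table 10.5), bit 0 taken as MSB
DEVICE_END = 0x04    # bit 5 of the status byte

def sio_condition_code(supervisor, channel_valid, channel_busy,
                       program_check, control_unit_busy,
                       status_byte, command_chaining):
    """Return the SIO condition code, or None when the program check
    interrupt microprogram is entered and no code is set."""
    if not supervisor:
        return None
    if not channel_valid:
        return 3
    if channel_busy:                  # working or holding a stacked interrupt
        return 2
    if program_check or control_unit_busy:
        return 1                      # the CSW status tells the CPU why
    if status_byte == 0:
        return 0                      # device available, operation started
    if command_chaining and status_byte in (CHANNEL_END,
                                            CHANNEL_END | DEVICE_END):
        return 0                      # chaining will follow
    return 1                          # any other status
```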


10.4.5 Other I/O Instructions


The test channel (TCH) instruction determines the status of an addressed channel
and sets the condition code in the PSW as shown in Table 10.3. The TCH operation
is shown in the flowchart in Fig. 10.18. It first tests whether the instruction
was issued in the supervisory state; this is accomplished by checking bit 15 of
the PSW. If the PSW bit 15 is 1, it indicates the problem-program state, and the
execution is transferred to the program-check interrupt microprogram. If the PSW
bit 15 is 0, the validity of the channel number is checked. If the channel number is
invalid, the condition code is set to 3 to indicate that the channel is not available.

When the channel is found valid, the execution is branched according to the 
SCH or MCH. If it is a SCH, the UCW in local storage is examined to determine 
whether the channel is busy. If the channel is not busy, the SCH interrupt-request 
register is tested. If this register is 1, the condition code is set to 1 to signify an inter- 
rupt pending. Otherwise, the condition code is set to 0 to signify that the channel 
and the unit are available. In case the channel is busy because of an interrupt stacked 
in the channel (i.e., an interrupt pending), the condition code is set to 1. If the channel
is busy because of working (no interrupt pending), the condition code is set to 2. In 
all cases, the next instruction is fetched. The test channel instruction is normally 
followed by a conditional instruction which causes a branch in the program, depend- 
ing on the condition code. 
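
The test channel decisions for a selector channel reduce to a few tests, sketched here in Python. The function name and its parameters are hypothetical; the sketch folds the "interrupt-request register is 1" and "busy with a stacked interrupt" cases into a single interrupt_pending argument, which is an interpretation of the text rather than a statement about the hardware.

```python
def tch_condition_code(psw_bit15, channel_valid, busy, interrupt_pending):
    """Return the TCH condition code, or None when the program-check
    interrupt microprogram is entered (problem-program state)."""
    if psw_bit15:              # 1 means problem-program state
        return None
    if not channel_valid:
        return 3               # channel not available
    if interrupt_pending:
        return 1               # interrupt pending in the channel
    if busy:
        return 2               # channel working, no interrupt pending
    return 0                   # channel and unit available
```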

The operations of the halt I/O and test I/O instructions in the SCH are similar
to those in the MCH; these operations will be described in the next section on the
multiplexor channel.


[Fig. 10.18 Flowchart showing the test channel operation]


10.5 Multiplexor Channel 


The IBM System/360 model 40 computer is a medium-size digital computer.
In order to make the system less costly, the multiplexor channel or MCH shares all
the CPU registers and buses except for several additional registers which handle
buffering and interface. The microprogram which operates and controls the MCH
is the distinguishing part of the multiplexor channel. The MCH operates in two modes,
the burst mode for one high-speed device and the byte mode (or the multiplex mode)
for many slow-speed devices. In the byte mode, once an operation is started, the CPU
becomes disconnected from the channel; the CPU and the MCH then operate simultaneously,
except at the time when the main memory is being accessed by the MCH. In addition,
there is an overhead time in switching between the CPU and the MCH. In the burst
mode the MCH remains connected to the CPU for the entire I/O operation; there
is no concurrent operation.

There are four I/O instructions: start I/O (SIO), test I/O (TIO), halt I/O (HIO),
and test channel (TCH). The execution of these I/O instructions is done in part by 
the CPU microprogram and in part by the MCH microprogram. This section now 
describes channel operations in executing I/O instructions after the unit control word 
is first introduced. 


10.5.1 Unit Control Word 


Each subchannel of the MCH controls one I/O device. All information necessary
to sustain the operation of an I/O device by a subchannel is contained in a unit control
word, UCW. The UCW's for the subchannels are stored in the mpx storage. The
UCW format is shown in Table 10.9. Each UCW occupies 16 bytes of which six are
not used. Since there can be a maximum of 128 subchannels, the mpx storage has a
capacity up to 1,024 halfwords. The UCW collects the device address, the CCW
address, the protection key, the command code, the flags, the data address, the count,
and the status from the same sources as those of the SCH.


TABLE 10.9 The UCW Format for the Multiplexor Channel


BITS DESIGNATION
UCW0(0-15) Count
UCW2(0-15) Data address except at the end time
UCW2(0-7) Unit number at the end time
UCW2(8-15) Unit status at the end time
UCW4(0-4) Flags
    (0) Chain data address (CDA)
    (1) Chain command (CC)
    (2) Suppress length indication (SLI)
    (3) Skip
    (4) Program controlled interrupt (PCI)
UCW4(5-7) Op-code
UCW4(8) Zero-count indicator
    (9) End-status-reached indicator
    (10-15) Extension of data address
UCW6(0-15) New CCW address
UCW8(0,4,5,7) Not used
    (1) Wrong length record (WLR)
    (2) Program check
    (3) Protection check
    (6) Interface control check (ICC)
    (8-15) Extension of new CCW address
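
The field layout of halfword UCW4 in Table 10.9, and the storage arithmetic quoted above, can be checked with a brief Python sketch. The helper names bits and decode_ucw4 are hypothetical, and the sketch assumes the System/360 numbering in which bit 0 is the most significant bit of the 16-bit halfword.

```python
def bits(halfword, first, last):
    """Extract bits first..last (inclusive) of a halfword, bit 0 = MSB."""
    width = last - first + 1
    return (halfword >> (15 - last)) & ((1 << width) - 1)

def decode_ucw4(halfword):
    """Split UCW4 into the fields listed in Table 10.9."""
    return {
        "CDA": bits(halfword, 0, 0),
        "CC": bits(halfword, 1, 1),
        "SLI": bits(halfword, 2, 2),
        "SKIP": bits(halfword, 3, 3),
        "PCI": bits(halfword, 4, 4),
        "op_code": bits(halfword, 5, 7),
        "zero_count": bits(halfword, 8, 8),
        "end_status_reached": bits(halfword, 9, 9),
        "data_address_extension": bits(halfword, 10, 15),
    }

# 128 subchannels, 16 bytes per UCW, two bytes per halfword.
SUBCHANNELS = 128
UCW_BYTES = 16
MPX_HALFWORDS = SUBCHANNELS * UCW_BYTES // 2   # 1,024 halfwords
```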


10.5.2 Test Channel 


The operation of the test channel (TCH) instruction for the MCH is similar to 
that for the SCH, as has been shown in the flowchart in Fig. 10.18. Briefly, if the 
CPU is in the supervisory state, if the channel is valid, and if the requested channel 
is the MCH, the MCH interrupt-request register is tested. If this register is 1, the 
condition code (Table 10.3) is set to 1 to signify an interrupt pending; otherwise, the 
condition code is set to 0 to signify that the channel and the unit are available. 


10.5.3 Halt I/O 


The halt I/O (HIO) instruction causes the device to stop immediately or to stop 
when it next requests service. It is a privileged operation. This instruction, on the
multiplexor channel, is effective only while working in the byte mode. The HIO 
operation is shown in the flowchart of Fig. 10.19. It first tests if the state is the super- 
visory state and if the channel is valid and if the subchannel is busy. If the channel 
is not valid, the condition code (Table 10.3) is set to 3 and the next instruction is 
fetched. 

If the subchannel is busy and holding an interrupt (end status reached), the condition
code is set to 0 and the next instruction is fetched.

If the subchannel is busy but not holding an interrupt, an operation is in
progress on the selected subchannel. The device is connected to the channel and its
operation is halted. If the subchannel is not busy, the device is also connected to
the channel in order to clear a possible stacked device-end interrupt.

At this point, if the device selection indicates that the device is not available, 
the condition code is set to 3 and the next instruction is fetched. 

If the device is available and if the control unit is not busy, the channel issues
a halt I/O command to halt the device operation. The count-zero flag in the UCW
is set; the status is stored in the CSW and the condition code is set to 1. If the device
is available and if the control unit is busy, the channel issues a stop control sequence
to stop the device the next time it requests service. Again the count-zero flag is set;
the status is stored in the CSW, and the condition code is set to 2. In either case, the
next instruction is fetched.
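
The halt I/O outcomes can be condensed into a small decision function, given here as a Python sketch. The function name and parameters are hypothetical. The sketch follows Fig. 10.19 in treating a subchannel that is busy with a stacked interrupt as the case that yields condition code 0; that reading is an assumption, and in every other case the device is selected.

```python
def hio_condition_code(supervisor, channel_valid, holding_interrupt,
                       device_available, control_unit_busy):
    """Return the HIO condition code, or None when the program check
    interrupt microprogram is entered."""
    if not supervisor:
        return None
    if not channel_valid:
        return 3
    if holding_interrupt:          # end status reached, nothing to halt
        return 0
    if not device_available:       # found during device selection
        return 3
    if control_unit_busy:
        return 2                   # stop control sequence issued
    return 1                       # halt I/O command issued
```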


10.5.4 Test I/O


The test I/O instruction (TIO) tests the state of an I/O device and sets the condi- 
tion code accordingly. The TIO operation is shown in the flowchart in Fig. 10.20. 


[Fig. 10.19 Flowchart showing the halt I/O operation]

[Fig. 10.20 Flowchart showing the test I/O operation]


The operation begins by testing whether the CPU is in the supervisor state. If it is
not in the supervisor state, the CPU enters the program check interrupt microprogram;
the latter sets the privileged operation bit of the interrupt code of the current
PSW to 1, stores this PSW in main storage, and fetches a new PSW from main storage.

If the CPU is in the supervisory state, the channel validity is tested next. If the
channel is not valid or not operational, the condition code (Table 10.3) is set to 3
and the next instruction is fetched.

If the CPU is in the supervisory state and if the channel is valid and if the subchannel
is not busy, then the channel selects the I/O device, issues a test I/O command
(not the test I/O instruction), and receives a status byte from the device. If the status
byte indicates that the device is not available, the CSW is stored, the condition code
is set to 1, and the next instruction is fetched. If the device is available, the condition
code is set to 0 and the next instruction is fetched.

If the channel is valid but the subchannel is busy, the channel-end signal from
the control unit is tested. If the channel-end signal is absent, this indicates that the 
device is working and the condition code is set to 2. If the test I/O command has 
reached the end and if the device status at the end time is available in the subchannel, 
the status is stored in the CSW and the condition code is set to 1. If the device status 
is not available in the subchannel, the channel selects the device, issues a test I/O 
command, and receives a status byte from the device. The status is stored in the CSW 
and the condition code is set to 1. In all cases, the next instruction is fetched. 
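
The test I/O cases can likewise be written as a decision function. The Python sketch below follows the prose description only; the function name and parameters are hypothetical, and the two busy cases that end in condition code 1 (status already held in the subchannel, or status fetched by selecting the device) are not distinguished because both give the same code.

```python
def tio_condition_code(supervisor, channel_valid, subchannel_busy,
                       channel_end=False, device_available=True):
    """Return the TIO condition code, or None when the program check
    interrupt microprogram is entered."""
    if not supervisor:
        return None
    if not channel_valid:          # not valid or not operational
        return 3
    if not subchannel_busy:
        # The device is selected and a test I/O command is issued.
        return 0 if device_available else 1
    if not channel_end:
        return 2                   # device still working
    return 1                       # end reached, status stored in the CSW
```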


10.5.5 Start I/O


The start I/O instruction initiates a read, write, read-backward, control, or sense 
operation at the addressed I/O device according to the command code in the CCW 
bits 0-7. Like the SIO executed in the SCH, the SIO in the MCH selects the channel, 
the control unit, and the I/O device. It issues the command. It sets up the UCW and 
stores it in mpx storage. It sets the appropriate condition code in stats Y2 and Y3. 
If the MCH is in the byte mode, the channel disconnects from the CPU, and the 
CPU fetches the next instruction in sequence. 

The SIO operation is shown in the flowchart in Fig. 10.21. The CPU reads the 
CAW out of the main storage. The CAW gives the address to fetch the first CCW 
from main storage. The CCW specifies the operation to be performed, the main 
storage area to be used, the action to be taken upon completion of the operation, 
and the number of bytes to be handled. The device is then selected and the command 
from the CCW is issued to the device. The UCW is formed and stored in the mpx 
storage. If the selected device works in the byte mode, the condition code is set to 
0 to indicate that the channel is available and not busy. The CPU disconnects from 
the channel and then operates in parallel with the channel. The channel now interrupts 
the CPU program only when it needs to store or fetch data from the main storage. 

In the case of the byte mode, when the control unit requires data service (1.е., 
access to main storage), a microprogram interrupt is generated. This interrupt sus- 
pends the CPU microprogram and initiates a “dump” which stores the contents of 
all CPU registers in local storage. The dump enables the CPU registers to act as 
the channel in servicing data to or from the control unit. The data and control infor- 
mation of the subchannel is now loaded from the UCW in the mpx storage into the 
CPU registers. The data service is performed by accessing main storage using the 
data address from the UCW. The data are transferred to or from the I/O control 


Fig. 10.21 (Parts 1 and 2) Flowchart showing the SIO operation. Abbreviations used
in the figure: LS = local storage, MS = main storage, CH = channel, CU = control
unit, CC = condition code, MCH = multiplexor channel, mpgm = microprogram.


unit, and the count and data address fields of the UCW are updated. The updated 
UCW is replaced in mpx storage. Then an “undump” occurs in which the CPU 
registers are restored to the state existing prior to the microprogram interrupt. The 
servicing of one segment of data is completed and the CPU instruction that was 
suspended continues to completion. Further segments of data for transmission 
between the I/O device and main storage are handled in a similar manner, each 
transfer being initiated by one I/O interrupt, until the channel-end signal is sensed 
by the channel. 
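
The byte-mode data service described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the actual microprogram: the class name UCW, the function service_one_byte, and the register names are hypothetical, the UCW is reduced to its count and data address fields, and the transfer is assumed to be an input operation in which the count is decremented and the data address is incremented by one per byte.

```python
from dataclasses import dataclass


@dataclass
class UCW:
    """Simplified unit control word: only the two fields updated per byte."""
    count: int          # bytes still to be transferred (assumed to count down)
    data_address: int   # next main-storage address (assumed to count up)


def service_one_byte(cpu_regs, mpx_storage, subchannel, main_storage, byte_from_device):
    """Model one byte-mode data service for an input operation."""
    local_storage = dict(cpu_regs)              # "dump": save all CPU registers
    ucw = mpx_storage[subchannel]
    cpu_regs.clear()                            # CPU registers now act as the channel
    cpu_regs.update(count=ucw.count, data_address=ucw.data_address)

    main_storage[cpu_regs["data_address"]] = byte_from_device   # main-storage access
    cpu_regs["count"] -= 1                      # update the count field
    cpu_regs["data_address"] += 1               # update the data address field

    # Replace the updated UCW in mpx storage.
    mpx_storage[subchannel] = UCW(cpu_regs["count"], cpu_regs["data_address"])
    cpu_regs.clear()
    cpu_regs.update(local_storage)              # "undump": restore the CPU registers
    return mpx_storage[subchannel]


# Usage: the suspended CPU state survives the service unchanged.
regs = {"R1": 7, "R2": 9}
mpx = {3: UCW(count=2, data_address=100)}
ms = {}
service_one_byte(regs, mpx, 3, ms, 0x41)
```

The point of the sketch is the bracketing: everything between the dump and the undump borrows the CPU registers, which is why the suspended instruction can afterwards continue to completion.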

In the case of the burst mode, the data and control information of the subchannel 
are also loaded from the UCW in the mpx storage into the CPU registers. The CPU 
does not disconnect the channel but continues to act as the channel. Whenever the 
data are ready for transfer from the device, the CPU performs the data service and 
updates the count and data address fields of the UCW. There is no need of “undump.” 
The data transfer continues until the channel-end signal is sensed by the channel. 

At this time, whether the channel is in the byte or the burst mode, the chaining 
flags in the CCW are examined to determine chaining. If there is chaining, the chan- 
nel, after receiving the device-end signal from the control unit, fetches the next CCW 
and executes the command. This process continues until all the CCW’s of the SIO 
are executed. An interrupt is next initiated, and the channel is ready to terminate 
the operation. 

The channel accepts the interrupt if there is no previous interrupt awaiting 
service. At this time, the system mask is examined to determine if this interrupt on 
this channel is to be allowed. If the channel is masked to allow, a maskable interrupt 
bit is set. If the channel does not accept the interrupt, the channel requests the control 
unit to stack the status byte in the control unit. Whether the channel accepts the 
interrupt or not, it restores the UCW to the mpx storage. 

If the channel is in the byte mode, an undump is performed; if it is in the burst 
mode, the condition code is set to 0. In either case, the CPU program that was in 
progress until the acceptance of the I/O interrupt now continues. If the CPU is not 
fetching an instruction or if it is but the mask-allow indicator is not set, the CPU 
continues to execute the program. If the CPU is fetching an instruction and if the 
mask-allow indicator is set, the I/O interrupt microprogram is entered and a CSW 
is formed. This CSW gives the status of the channel and the device at the completion 
of the I/O operation just performed and is stored in main storage. Furthermore, 
the channel-unit address becomes the interrupt code of the current PSW and the 
current PSW is stored as the I/O old PSW. This PSW will indicate the channel and
unit that caused the interrupt, and the CSW will indicate why the interrupt took
place. A new I/O PSW is fetched from main storage and loaded to the current PSW
location in local storage. At this point, the automatic handling of an I/O interrupt 
by the microprogram ceases, and the instruction counter of the new PSW gives the 
starting address of an interrupt subroutine. 




References 


1. ECKERT, J. P., JR., WEINER, J. R., WELSH, H. F., and MITCHELL, H. F., “The UNIVAC
System,” Review of Electronic Digital Computers, Joint AIEE-IRE Computer Conference,
February, 1952.

2. BUCHHOLZ, W., “The System Design of the IBM Type 701 Computer,” Proceedings of
the IRE, October, 1953, pp. 1262-1275.

3. FORGIE, J. W., “The Lincoln TX-2 Input-Output System,” Proceedings of J.W.C.C.,
1957, pp. 156-160.

4. BUCHHOLZ, W., ed., Planning a Computer System. New York: McGraw-Hill Book
Company, 1962.

5. FAGG, P., BROWN, J. L., HIPP, J. A., and DOODY, D. T., “IBM System/360 Engineering,”
Proceedings of Fall Joint Computer Conference, 1964, pp. 205-231.

6. PADEGS, A., “The Structure of System/360, Part IV, Channel Design Considerations,”
IBM Systems Journal, 3, Nos. 2 and 3, 1964, pp. 165-180.

7. BLAAUW, G. A., “The Structure of System/360, Part V, Multisystem Organization,”
IBM Systems Journal, 3, Nos. 2 and 3, 1964, pp. 181-195.

8. HASSITT, A., Computer Programming and Computer Systems. New York: Academic
Press Inc., 1967.

9. GSCHWIND, H. W., Design of Digital Computers. Springer-Verlag New York Inc., 1967.

10. CHU, W. W., “A Study of Synchronous Time Division Multiplexing for Time-sharing
Computer Systems,” Proceedings of the FJCC, 1969.

11. THORNTON, J. E., Design of a Computer: the Control Data 6600. Glenview, Illinois:
Scott, Foresman and Company, 1970.


Chapter 11 Microprogramming Software


The previous chapters have shown that hardware can be implemented by
means of microprogramming. Implementing hardware by microprogramming
has many advantages, two of which are flexibility of implementation and
similarity to software. Flexibility of implementation allows, within limits, the use of
a different microprogram for a different algorithm without changing any hardware,
provided the control memory is a changeable read-only memory or a read-write memory.
Similarity to software allows programming techniques to be applied to
microprogramming.

The previous chapters have also shown that hardware implements algorithms
just as software does. Therefore, software can also be implemented
by microprogramming. Implementing software by microprogramming gives
two major advantages: retention of the flexibility of implementation and an increase
in speed of execution. This chapter introduces the idea of microprogramming
software by an example. The example shows in detail the microprogramming of
the translation of a relocatable code into an executable code; this translation is a part of
the software called the loader.


11.1 Translation of Relocatable Code into 
Executable Code 


A computer system executes programs. Each complete program is often called 
a job. In a batch computer system, the jobs are stacked together into a batch and are 
then processed by the computer system one after another until the batch is exhausted. 
Each job usually consists of a number of subprograms. A subprogram can be a main 
program, a subroutine, or an independent section of code. These subprograms can 
be in one or more source languages, which can be a compiler language, an assembly 
language, or a binary code. In order that the entire program of the job can be executed 
by the computer, it must be put together into an integrated program in absolute 
address and loaded into the main memory. The process to accomplish this result 
may be divided into three phases. The first phase, called assembly, translates each 
subprogram into a relocatable code; the job now consists of a collection of relocatable 
codes. The second phase, called linkage editing, links these relocatable codes into 
one relocatable code. The third phase, called loading, translates the relocatable code 
into an executable code and stores the executable code into the main memory. In order 
to partially illustrate microprogramming software, translation by hardware of a
relocatable code into an executable code is described. For convenience, the word loader
is used to denote this translator.

Owing to the relatively small capacity of the main memory in most computers,
a program (i.e., its executable code) larger than the capacity of the main memory
cannot be loaded into the main memory in its entirety. A common technique is to use
overlay. By overlay we mean that the programmer divides his program into a number
of independent memory loads (each of which must be smaller than the memory
capacity) by using special control cards. The currently active memory load can call
in another memory load to overlay itself. One memory area is kept from being
overlaid so that it can hold information to be passed from memory load
to memory load. For simplicity, it is assumed that no overlay is required for the
loader now described.


11.1.1 Relocatable Elements 


The input to the loader is one or more relocatable elements. Each relocatable 
element consists of two parts, (a) relocatable code, and (b) symbolic address tables. 




The relocatable code is the intermediate form of the machine language instructions 
and data that result from the assembly of an assembly language subprogram. A 
machine language instruction is usually made up of a non-address part (e.g., the 
op-code part) and an address part. Since relocatable translation requires only the 
adjustment of addresses, it is only necessary to distinguish between the address part 
and the non-address part of the instruction. Therefore, a relocatable code is regarded 
as consisting of a sequence of relocatable words. Each relocatable word contains an 
address part or a non-address part of an instruction. In this context, a data word can 
be viewed as an instruction without an address part. 

The relocatable word formats of the chosen example are shown in Fig. 11.1. 
Each format has one to five fields: OP, FS, II, FLD, and INC. The FLD field
contains the data, or the address part of an instruction, or an index to a table. The FS
field indicates the length of the FLD field in octal digits. The OP field identifies the
FLD field as one of the following:


1. Data (OP=1).

2. Relative Address (OP=2). This address references a location within the subprogram
relative to the subprogram address. The subprogram address is where the executable
code is to be loaded.

3. Common Data Address (OP=3). This is an external address which references a data
area which is common to several subprograms.

4. External Address (OP=4). This is an external address which references an entry
point in other subprograms.

5. End of relocatable code (OP=5).


The II and INC fields give a numerical increment to these symbolic addresses. If II
is equal to 0, there is no INC field. If II is equal to 1 or 2, the address is incremented
or decremented, respectively, by the contents of the INC field.

Figure 11.2 shows an example of the relocatable code where the fields are
separated by vertical lines and the numbers are octal. Words 1, 2, 3, 4, 6, 8, 9, 11, 12, 13,
15, and 17 are relocatable words with data (OP=1). Words 5, 7, 10, and 14 are those
with relative addresses (OP=2). Word 16 is the one with a common data address
(OP=3). Word 18 is the one with an external address (OP=4). Word 19 is the one
indicating the end of the relocatable code (OP=5). These relocatable words are of
different lengths. Though they are shown as left-justified, they are actually a string
of bits stored in the memory, as shown in Fig. 11.3.
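
The field layout can be made concrete with a short Python sketch that splits a packed string of octal digits into relocatable words. It is an illustration under stated assumptions rather than the loader itself: the function names are hypothetical, the header is taken to be 3 bits of OP, 4 bits of FS, and 2 bits of II (nine bits, i.e., three octal digits), FLD and INC are each taken to be 3*(FS) bits wide, and FS is assumed to be at least 1. The sample digits encode the relative address 00102, the common data index 00010 with increment 00454, the external index 00020, and the end-of-code word described in the text.

```python
def octal_to_bits(digits: str) -> str:
    """Expand a string of octal digits into a string of '0'/'1' characters."""
    return "".join(format(int(d, 8), "03b") for d in digits)


def take(bits: str, pos: int, width: int) -> int:
    """Read `width` bits starting at `pos` as an unsigned integer."""
    return int(bits[pos:pos + width] or "0", 2)


def unpack_words(octal_stream: str):
    """Split a packed stream into relocatable words (dicts of field values)."""
    bits = octal_to_bits(octal_stream)
    pos, words = 0, []
    while pos < len(bits):
        op = take(bits, pos, 3)
        if op == 5:                        # Format E: OP only, end of relocatable code
            words.append({"OP": 5})
            break
        fs = take(bits, pos + 3, 4)        # length of FLD in octal digits
        ii = take(bits, pos + 7, 2)        # 0: no INC, 1: increment, 2: decrement
        pos += 9
        width = 3 * fs
        word = {"OP": op, "FS": fs, "II": ii, "FLD": take(bits, pos, width)}
        pos += width
        if ii != 0:                        # INC is present only when II is 1 or 2
            word["INC"] = take(bits, pos, width)
            pos += width
        words.append(word)
    return words


# Header digits 224, 325, 424 are OP/FS/II = 2/5/0, 3/5/1, 4/5/0 packed into 9 bits.
stream = "22400102" + "3250001000454" + "42400020" + "5"
for w in unpack_words(stream):
    print(w)
```

Because the header occupies exactly three octal digits and every other field a whole number of digits, word boundaries always fall on octal-digit boundaries even though the OP, FS, and II fields individually do not.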

The above relative address, common data address, and external address of a 
relocatable element are all symbolic, and are to be used to link the subprograms 
together. They are stored in the following three symbolic address tables: 


1. Defined Symbol Table (DST). This table contains the symbolic name of each entry
point in the subprogram and its relative address in the subprogram.

2. Undefined Symbol Table (UST). This table contains the symbolic name of each external
address (OP=4) in the subprogram. The instruction in the subprogram which
references the external address contains an address linking to the entry of this external
address in the UST table.

3. Common Symbol Table (CST). This table contains the symbolic name of each common
data address (OP=3) in the subprogram. The instruction in the subprogram which
references the common data address contains an address linking to the entry of this
common data address in the CST table.


Fig. 11.1 Relocatable word formats: (a) Format A, indicating
data; (b) Format B, indicating relative address; (c)
Format C, indicating common data address; (d)
Format D, indicating external address; (e) Format E,
indicating the end of relocatable code

Format   Fields (widths in bits)
(a)      OP (3), FS (4), II (2), FLD (3*(FS)): data part of instruction or data word
(b)      OP (3), FS (4), II (2), FLD (3*(FS)): address relative to the subprogram
(c)      OP (3), FS (4), II (2), FLD (3*(FS)): index to address in Common Symbol Table;
         INC (3*(FS)): possible increment or decrement
(d)      OP (3), FS (4), II (2), FLD (3*(FS)): index to address element in Undefined
         Symbol Table; INC (3*(FS)): possible increment or decrement
(e)      OP (3)

In formats (a) to (d), FS gives the length of the FLD field (in bytes, i.e., 3-bit octal digits).


Fig. 11.2 Example of a relocatable code (in octal)


Fig. 11.3 Example of a relocatable code in the memory (double
lines separate relocatable words and all digits are
octal; one memory word is 36 bits)


An example of these three tables as they appear in the memory is shown in Fig. 11.4;
they are referred to again below.


11.1.2 Executable Code 


A stored-program computer is designed to execute machine instructions with 
absolute addresses (i.e., main memory addresses). A program written in such machine
instructions is called an executable code. The output from the loader is an executable 
code. The first word of the executable code may store the absolute address at which 
the loading of the executable code into the main memory begins. 

Figure 11.5 is an example of executable code which is the output from the
translation of the relocatable code in Fig. 11.2. The instruction and data formats in Fig.
11.5 follow those of the IBM 7090 family of computers. As shown, the first word
contains absolute address 17000₈. The executable code occupying locations 65-75 is
assumed to be a part of a larger program. The loader translates the 19 relocatable
words into 12 machine instructions because some of the machine instructions consist
of both address parts and data parts that are described by more than one relocatable
word. The FLD fields of the relocatable words 1, 2, 3, 8, 11, and 12 in Fig. 11.2
become words 65, 66, 67, 70, 72, and 73 in Fig. 11.5, respectively, because these
relocatable words have no address part.


Fig. 11.4 Example of tables and buffers in the memory


Fig. 11.5 An executable code in the main memory


11.1.3 Translation Algorithm 


The translation process first unpacks the relocatable code stored in the input
buffer, then interprets the OP field of each word of the relocatable code, and finally
assembles the data or the modified address into a sequence of machine-language
instructions in the output buffer. The buffers are shown in Fig. 11.4. The translation
process is shown in the flowchart of Fig. 11.6. As shown, the initial step places the
subprogram address in the output buffer. The first or the next relocatable word is
then read out of the input buffer for unpacking. The unpacking process recognizes the
boundaries of the relocatable words as well as those of the fields of each word. The
OP field is decoded first, and the address is modified as follows.


1. If OP is 1, it indicates data (format A shown in Fig. 11.1). Since it is not an address,
no address modification is required. The FS octal digits of the FLD field are placed
in the next available octal digit positions of the output buffer, where FS denotes the
size of the FLD field in octal digits.

2. If OP is 2, it indicates a relative address (format B). The FS octal digits of the FLD
field (the relative address) are added to the subprogram address (such as 17000₈ in
Fig. 11.5). The sum is placed in the next FS octal digit positions of the output buffer.

3. If OP is 3, it indicates a common data address (format C). The FS octal digits of the
FLD field are an index. This index is an address relative to the location (such as
2000₈ in Fig. 11.4) of the common symbol table (CST). The absolute address
stored at this location is retrieved. If field II contains 0, there is no address modification.
If field II contains 1 or 2, the contents of the INC field are added to or subtracted from the
absolute address, respectively. In any of the three cases, the resulting address is then
placed in the next FS octal digit positions of the output buffer.

4. If OP is 4, it indicates an external address (format D). The FS octal digits of the FLD
field (which is an address relative to UST) are added to the address of the undefined
symbol table (UST) (such as 3000₈ in Fig. 11.4). The index stored at this location is
retrieved and added to the address of the defined symbol table (DST) (such as 1000₈
in Fig. 11.4). Then, the absolute address at this location is retrieved. If field II is
0, no further modification of the address is required. If field II is 1 or 2, the contents
of the INC field are added to or subtracted from the absolute address, respectively.
In any of the three cases, the absolute address is placed in the next FS octal digit
positions of the output buffer.

5. If OP is 5, it indicates the end of the relocatable code (format E). The translation is
terminated.


Fig. 11.6 Flowchart showing the translation algorithm


After one of the above operations is performed, the next relocatable word is read out
of the input buffer. Unpacking, decoding, and address modification continue
until the end of the relocatable code is reached, at which point the translation is
complete. This translation process will be described in more detail later when the
sequence charts are presented.
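
The five cases above amount to a dispatch on the OP field, which can be sketched in Python as follows. The sketch is illustrative only: the function name absolute_field and the dictionary layout of a relocatable word are hypothetical, memory is modeled as a dictionary from address to the address value stored there (symbolic names are omitted), and the table locations and contents in the usage lines are those quoted in the text for Fig. 11.4.

```python
def absolute_field(word, memory, subprogram, cst, ust, dst):
    """Return the value to be placed in the output buffer, or None at the end."""
    op, fld = word["OP"], word.get("FLD", 0)
    if op == 5:                              # end of relocatable code
        return None
    if op == 1:                              # data: no address modification
        return fld
    if op == 2:                              # relative address
        return subprogram + fld
    if op == 3:                              # common data address: one table lookup
        address = memory[cst + fld]
    elif op == 4:                            # external address: UST index, then DST
        index = memory[ust + fld]
        address = memory[dst + index]
    else:
        raise ValueError(f"unknown OP value {op}")
    if word["II"] == 1:                      # increment by INC
        address += word["INC"]
    elif word["II"] == 2:                    # decrement by INC
        address -= word["INC"]
    return address


# Table contents quoted in the text (all values octal).
memory = {0o2010: 0o10010, 0o3020: 0o5, 0o1005: 0o24232}
tables = dict(subprogram=0o17000, cst=0o2000, ust=0o3000, dst=0o1000)

relative = {"OP": 2, "FS": 5, "II": 0, "FLD": 0o102}
common = {"OP": 3, "FS": 5, "II": 1, "FLD": 0o10, "INC": 0o454}
external = {"OP": 4, "FS": 5, "II": 0, "FLD": 0o20}
for w in (relative, common, external):
    print(format(absolute_field(w, memory, **tables), "o"))
```

Note that only the external address needs two memory references; this is the extra step that the address modification sequence later carries out through register X8.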

As an example, let the relocatable code in Figs. 11.2 and 11.3 be the input which
is stored in the input buffer area of the memory shown in Fig. 11.4. Let the executable
code in Fig. 11.5 be the output from the translation which is stored in the output
buffer area of the memory also shown in Fig. 11.4. Word 1 in Fig. 11.5 contains the
absolute address 17000₈ from which the subsequent machine-language-instruction
sequence is to be loaded. Words 2-64 are assumed to be some other part of the
subprogram. Relocatable words 1, 2, and 3 in Fig. 11.2 contain the data and are thus
translated without modification into words 65, 66, and 67 of the output buffer in
Fig. 11.4. Relocatable word 4 in Fig. 11.2 also contains the data and is translated
without modification into the first seven octal digits of word 68 in Fig. 11.5.
Relocatable word 5 in Fig. 11.2 stores a relative address; thus, the contents of the FLD
field are added to the subprogram address (17000₈), and the result is then placed as
the last five octal digits of word 68 (17102₈). Words 69-74 in Fig. 11.5 are similarly
translated from relocatable words 6-14 in Fig. 11.2. Word 75 in Fig. 11.5 is a machine
instruction with two addresses; it is translated from relocatable words 15-18 in Fig.
11.2. Relocatable word 15, which contains the op-code of the instruction, becomes
the first octal digit of word 75. Relocatable word 16 contains an index (00010₈) to
the common symbol table (CST). At the eighth location relative to address CST
(2000₈) in Fig. 11.4, the absolute address is found to be 10010₈, whose symbolic
address name is ARR. Since field II is 1, address 10010₈ is incremented by the contents
of field INC; the result is 10464₈ and is octal digits 2-6 of word 75. Relocatable word
17, which contains the index of the instruction, is translated without modification
into octal digit 7 of word 75. Relocatable word 18 contains an index (00020₈) to the
undefined symbol table (UST). At the sixteenth location relative to address UST (3000₈)
as shown in Fig. 11.4, the external address is found to be 5 and the symbolic name to
be DOT. This address (0005₈) is relative to the location DST (1000₈) of the defined
symbol table. At this location (1005₈), absolute address 24232₈ is found. Since field
II of relocatable word 18 contains 0, no address modification is required; this absolute
address is entry point DOT. Relocatable word 19 indicates the end of the relocatable
code.
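
The way fields of different lengths fill one 36-bit memory word (12 octal digits) can be checked with a small Python sketch. The helper names are hypothetical, and the op-code digit "0" and index digit "2" used for word 75 are made-up placeholders; only the two five-digit addresses are taken from the example above.

```python
def field_digits(value: int, fs: int) -> str:
    """Render a value as exactly `fs` octal digits."""
    digits = format(value, "o").zfill(fs)
    if len(digits) > fs:
        raise ValueError(f"{value:o} does not fit in {fs} octal digits")
    return digits


def pack_fields(fields, digits_per_word=12):
    """Concatenate octal-digit fields and cut the result into memory words."""
    stream = "".join(fields)
    if len(stream) % digits_per_word:
        raise ValueError("fields do not fill a whole number of words")
    return [stream[i:i + digits_per_word]
            for i in range(0, len(stream), digits_per_word)]


# Word 75: 1 op-code digit + 5 address digits + 1 index digit + 5 address digits.
word_75 = pack_fields([
    "0",                                   # placeholder op-code digit (assumed)
    field_digits(0o10010 + 0o454, 5),      # ARR incremented by INC -> 10464
    "2",                                   # placeholder index digit (assumed)
    field_digits(0o24232, 5),              # entry point DOT
])
print(word_75)
```

The digit counts 1 + 5 + 1 + 5 = 12 show why relocatable words 15-18 together complete exactly one machine instruction.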


11.2 Configuration 


Figure 11.7 shows the configuration of the microprogrammed loader excluding
the control part which will be shown subsequently. Main memory M has address
register AR and storage register SR. Single-bit registers READ and WRITE are
used to initiate a memory read or a memory write, respectively. The relocatable
elements and the input and output buffers are stored in the memory. There are eight
index registers, X1, X2, ..., X8, which store the table and buffer addresses during
translation. Registers OP, FS, and II store, respectively, the OP field, the FS field,
and the II field of a relocatable word. The unpacking of a relocatable word and the
address modification of its address part are performed in registers A and B.
Single-bit register SH indicates whether register B or casregister A-B is shifted to the left,
according to whether register SH contains 0 or 1. Single-bit register UNPACK is the
control register for calling the unpacking sequence. In
addition, there are three counters C1, C2, and C3. This configuration is described
below.


Comment, configuration of the translator (11.1)
Register,    A(1-36),                $accumulator
             B(1-51),                $unpacking register
             AR(1-15),               $address register
             SR(1-36),               $storage register
             X1(1-15),               $store the INC field
             X2(1-15),               $store INPUT address
             X3(1-15),               $store OUTPUT address
             X4(1-15),               $store CST
             X5(1-15),               $store UST
             X6(1-15),               $store DST
             X7(1-15),               $store subprogram address
             X8(1-15),               $temporary storage
             OP(1-3),                $op-register
             FS(1-4),                $field size register
             II(1-2),                $incrementing indicator
             C1(1-4),                $count leftshifts in casregister
             C2(1-4),                $count leftshifts in register B
             C3(1-4),                $count leftshifts in register A
             SH,                     $shift-control register
             UNPACK,                 $control register
             READ,                   $memory read register
             WRITE,                  $memory write register
Subregister, B(ADR)=B(1-15),         $address part of unpacking register
             B(IN)=B(16-51),         $input part of unpacking register
             SR(AD)=SR(22-36),       $address part of storage register
Memory,      M(AR)=M(0-32767,1-36),  $main memory
Casregister, AB(1-87)=A-B,


Fig. 11.7 Loader configuration


11.3 Sequences 


The translation algorithm in Fig. 11.6 is now developed into sequence charts.
The flowchart in Fig. 11.8 shows the four sequences of the translator: initialization,
fetch, address modification, and unpacking. The sequence charts for these four
sequences are shown in Figs. 11.9-11.12. The initialization sequence initializes the
translation. The fetch sequence fetches a relocatable word, unpacks it, and decodes
it. The address modification sequence performs the address modification. The
unpacking sequence performs the task of reading a relocatable word out of the input buffer,
shifting casregister AB to the left, and storing a machine instruction into the output
buffer. As indicated by the dotted lines in Fig. 11.8, the unpacking sequence is called during
the fetch sequence and the address modification sequence.


Fig. 11.8 Flowchart showing the four sequences of the loader


Fig. 11.9 Sequence chart for the initialization sequence


Fig. 11.10 Sequence chart for the fetch sequence

Assume that the relocatable element and the buffers are initially in the memory
and that the relocatable element is a string of bits as shown in Fig. 11.3. The addresses
of the tables and buffers are assumed to be in the index registers as described below.


1. Input buffer location in register X2
2. Output buffer location in register X3
3. CST location in register X4
4. UST location in register X5
5. DST location in register X6
6. Subprogram address in register X7


Fig. 11.11 Sequence chart for the unpacking subsequence


Fig. 11.12 Sequence chart for the address modification sequence


11.3.1 Initialization Sequence


The initialization sequence in Fig. 11.9 performs five tasks. It reads the first
word out of the input buffer (location in register X2) and stores it in subregister
B(IN). It places the subprogram address in register X7 into subregister B(ADR).
It increments register X2 by 1. It resets register A to 0. It sets the initial contents of
counters C1, C2, and C3 to 5, 12, and 7, respectively.

Counter C1 counts the number of leftshifts of casregister AB in octal digits.
The shifting of the 5-octal-digit subprogram address from subregister B(ADR) to
subregister A(22-36) is controlled by setting counter C1 to 5 and then counting down
until it reaches 0. Counter C2 counts the number of leftshifts of register B in bytes.
The indication to read the next word from the input buffer into subregister B(IN)
is given by setting counter C2 to 12 and then counting down until it reaches 0. Counter
C3 counts the number of bytes that are shifted into register A, where the machine
instruction is being assembled. The five leftshifts required to complete the first machine
instruction in register A are controlled by setting counter C3 to 7 and then counting
up until it reaches 12.


11.3.2 Fetch Sequence 


The fetch sequence, as shown in Fig. 11.10, performs four tasks. It shifts the word
in subregister B(IN) the number of octal digit positions to the left indicated by counter
C1 so that the next relocatable word is now left-adjusted in register B. By making this
leftshift occur in casregister AB, it also shifts the address or data in the left part of
register B into register A. It then transfers the contents of the OP, FS, and II fields in
subregisters B(1-3), B(4-7), and B(8-9) to registers OP, FS, and II, respectively. Since
these three fields in subregister B(1-9) are of no further use, register B is leftshifted
three octal digit positions so that the FLD field of the relocatable word is left-adjusted
in register B.

In the above tasks, there are two leftshifts: one in register B and the other in
casregister AB. These two shifts are indicated by register SH, which contains 0 and 1,
respectively. Such a leftshift is also required in the address modification sequence.
For convenience, a subsequence called the unpacking subsequence is formed. This
subsequence is “called” by setting register UNPACK to 1 and “returns” to the calling
sequence by resetting register UNPACK to 0 in the subsequence. After the calling
sequence sets register UNPACK to 1, it constantly examines register UNPACK
and waits for its contents to become 0. When register UNPACK is being set to 1,
register SH should also be set to 0 or 1 in order to select one of the two possible
leftshifts.


11.3.3 Unpacking Sequence 


The unpacking subsequence in Fig. 11.11 performs four tasks. The first task
carries out the leftshift as described by the following conditional micro-statement,


IF (SH=0) THEN (B←3 shl B) ELSE (AB←3 shl AB)


and decrements counter C1 until it reaches 0. When counter C1 becomes 0, register
UNPACK is reset to 0. The second task is to read a word out of the input buffer
located by register X2 into subregister B(IN); this is controlled by counter C2. When
counter C2 reaches 0, the reading of the word from the input buffer is carried out.
The third task is to store a machine instruction assembled in register A into the
output buffer located by register X3; this is controlled by counter C3. When counter
C3 reaches 12, the storing of the assembled instruction in register A is carried out.
This can logically occur only after C1 becomes zero. The fourth task is waiting.
As shown in Fig. 11.11, there is a waiting loop in which the UNPACK subsequence
constantly examines register UNPACK and waits for its contents to become 0.

It should be noted that to call the UNPACK subsequence, register UNPACK 
should be set to 1, register SH should be set to 0 or I, and counter Cl should be set 
to the certain initia] value. 
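
The shifting task can be illustrated with a short Python sketch. It is a simplified 
model and not part of the loader design: the 36-bit width assumed for registers A 
and B and the function names are assumptions made for the example, and the buffer 
tasks controlled by counters C2 and C3 are left out. 

```python
def unpack_shift(a, b, sh, width=36):
    """One left shift of one octal digit (3 bits), zero-filled on the right.

    sh == 0: B <- 3 shl B      (register A is untouched)
    sh == 1: AB <- 3 shl AB    (the top octal digit of B moves into A)
    The register width is an assumption for this sketch.
    """
    mask = (1 << width) - 1
    if sh == 0:
        return a, (b << 3) & mask
    ab = (((a << width) | b) << 3) & ((1 << (2 * width)) - 1)
    return ab >> width, ab & mask


def unpack(a, b, sh, c1, width=36):
    """Shift once per count of C1, counting C1 down to 0.

    Models only the first task of the subsequence; when the loop ends,
    the hardware would reset register UNPACK to 0.
    """
    while c1 > 0:
        a, b = unpack_shift(a, b, sh, width)
        c1 -= 1
    return a, b
```

With SH=1 and C1=3, for instance, the three leftmost octal digits of register B end 
up right-adjusted in register A, which is how the fetch sequence moves an address 
or data field from B into A. 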


11.3.4 Address Modification Sequence 


The address modification sequence in Fig. 11.12 performs the operations specified 
by the OP and II fields on the operands in the FLD and INC fields. If the OP 
field contains 1, there is no address modification. If the OP field contains 2, the 
address in subregister B(ADR) is incremented by the subprogram address in register 
X7. Whether the OP field is 1 or 2, the contents of the FS field are transferred to 
counter C1. 

If the OP field contains 3, the index in subregister B(ADR) is incremented by 
the location of the common symbol table in register X4. The word is read out of this 
memory location and stored in register X8. If the OP field contains 4, the index in 
subregister B(ADR) is incremented by the location of the undefined symbol table in 
register X5. At this location is another address. This address is read out of the memory, 
stored in register X8, and incremented by the location of the defined symbol table in 
register X6. Then, the contents of this memory location are read out of the memory 
and stored in register X8. 

If the OP field contains 3 or 4 and if the II field is not 0, an addition or a 
subtraction is required. The contents of the INC field are first shifted into subregister 
B(ADR) and are then added (if II is 1) to or subtracted (if II is 2) from the contents 
in register X8 with the result stored in subregister B(ADR). The subtraction is 
performed by addition of the 2's complement of the subtrahend. Whether the OP field 
is 3 or 4, the contents of the FS field are transferred to counter C1. 

At this point, the address modification sequence is complete and it returns to 
the fetch sequence. 
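
The address arithmetic selected by the OP and II fields can be summarized in a small 
Python sketch. It only mirrors the rules stated above; the 18-bit address width, the 
dictionary used as main memory, and the function name are assumptions made for the 
illustration. 

```python
ADDR_BITS = 18                      # assumed address width, not given in the text
ADDR_MASK = (1 << ADDR_BITS) - 1


def modify_address(op, ii, adr, inc, x4, x5, x6, x7, memory):
    """Return the new contents of B(ADR) for one relocatable word.

    op: 1 data, 2 relative, 3 common-area, 4 external address
    ii: 0 no increment, 1 add INC, 2 subtract INC (only for op 3 or 4)
    memory: mapping from a main memory address to the address stored there
    """
    if op == 1:
        return adr                                  # no modification
    if op == 2:
        return (adr + x7) & ADDR_MASK               # add subprogram address
    if op == 3:
        x8 = memory[(adr + x4) & ADDR_MASK]         # common symbol table
    elif op == 4:
        x8 = memory[(adr + x5) & ADDR_MASK]         # undefined symbol table
        x8 = memory[(x8 + x6) & ADDR_MASK]          # defined symbol table
    else:
        raise ValueError("OP must be 1, 2, 3, or 4 for address modification")
    if ii == 0:
        return x8
    if ii == 1:
        return (x8 + inc) & ADDR_MASK
    if ii == 2:
        twos_complement = (~inc + 1) & ADDR_MASK    # subtract by adding
        return (x8 + twos_complement) & ADDR_MASK
    raise ValueError("II must be 0, 1, or 2")
```

For OP=4 the sketch performs two table lookups, as the text describes: the first 
yields an address in register X8, which is then offset by X6 to reach the entry in 
the defined symbol table. 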


11.4 Microprogram Control 


The loader is now to be implemented with microprogram control. In this section, 
the control configuration, the timing and control signals, and the control word format 
are described. The microprogramming is to be presented in the next section. 


11.4.1 Control Configuration 


Figure 11.13 shows the configuration for microprogram control. Control memory 
CM has a capacity of 256 36-bit words with address register CAR and buffer register 
F. The 8-bit register RETURN stores a control memory address for “subroutine” 
return. The four-phase clock P(0-3) in conjunction with the single-bit registers RUN 
and D and the 4-bit register MC generates the control signals. Switch START initiates 
the operation. The above configuration is now described by the following CDL 
statements: 


Comment, microprogram control configuration                                  (11.2) 
Register,    CAR(1-8),                 $control memory address register 
             F(1-36),                  $control word register 
             RETURN(1-8),              $micro-subroutine return register 
             MC(0-3),                  $register for sequencing main memory cycle 
             D,                        $memory cycle wait register 
             RUN,                      $start-stop register 
Subregister, F(ADS)=F(1-8),            $address portion of the control word 
Memory,      CM(CAR)=CM(0-255, 1-36), 
Switch,      START(ON),                $start switch 
Clock,       P(0-3),                   $four-phase clock 


11.4.2 Timing and Control Signals 


[Fig. 11.13 Microprogram control configuration: registers RETURN(1-8), CAR(1-8), and 
MC(0-3); control memory CM(CAR)=CM(0-255, 1-36); decoders; control logic network; 
control signals] 


Each main memory cycle is chosen to consist of four control memory cycles, 
and each control memory cycle coincides with one clock cycle. Therefore, there are 
four steps in each control memory cycle and 16 steps in each main memory cycle. 
The control signals for these 16 steps are described by the following sequence of 16 
labels: 


Comment, control signals in a main memory cycle                              (11.3) 
/MC(0)*P(0)*RUN/           $beginning of a main and a control memory cycle 
/MC(0)*P(1)*RUN/ 
/MC(0)*P(2)*RUN/ 
/MC(0)*P(3)*RUN/           $end of a control memory cycle 
/MC(1)*P(0)*RUN/           $beginning of a control memory cycle 
/MC(1)*P(1)*RUN/ 
/MC(1)*P(2)*RUN/ 
/MC(1)*P(3)*RUN/           $end of a control memory cycle 
/MC(2)*P(0)*RUN/           $beginning of a control memory cycle 
/MC(2)*P(1)*RUN/ 
/MC(2)*P(2)*RUN/ 
/MC(2)*P(3)*RUN/           $end of a control memory cycle 
/MC(3)*P(0)*RUN/           $beginning of a control memory cycle 
/MC(3)*P(1)*RUN/ 
/MC(3)*P(2)*RUN/ 
/MC(3)*P(3)*RUN/  D←0,     $end of both memory cycles 


In the above labels, the four steps in each control memory cycle are controlled by 
the four phases of clock P(0-3), and the four control memory cycles in each main 
memory cycle are controlled by the four states of ring counter MC(0-3). Register 
RUN is employed to activate the control signals for the 16 steps in a main memory 
cycle. 

During each main memory cycle, an instruction is read out of or written into 
the main memory. It is now specified that the transfer of the main memory address 
to register AR and the initiation of the main memory read or write must occur during 
the second step (i.e., /MC(0)*P(1)*RUN/). For a read operation, the word is available 
at buffer register SR during the sixth step (i.e., /MC(1)*P(1)*RUN/). For a write 
operation, the word to be stored into the memory is transferred into buffer register 
SR before the twelfth step (i.e., /MC(2)*P(3)*RUN/). 
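
The correspondence between a step number and its label follows directly from the 
four-by-four arrangement of ring counter states and clock phases. A minimal Python 
sketch, in which the function name is an assumption made for the illustration, is: 

```python
def step_label(step):
    """Return the CDL label of step 1..16 of a main memory cycle.

    Each state of ring counter MC spans four clock phases P(0)..P(3),
    so step n maps to MC((n-1) // 4) and P((n-1) % 4).
    """
    if not 1 <= step <= 16:
        raise ValueError("a main memory cycle has steps 1 through 16")
    mc, phase = divmod(step - 1, 4)
    return f"/MC({mc})*P({phase})*RUN/"
```

Calling step_label(2), step_label(6), and step_label(12) reproduces the three labels 
quoted above for the second, sixth, and twelfth steps. 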

If some micro-operations occur in every control memory cycle, the following 
sequence of four labels is used: 


Comment, control signals in a control memory cycle                           (11.4) 
/P(0)*RUN*D'/  F←CM(CAR)   $beginning of a control memory cycle 
/P(1)*RUN*D'/ 
/P(2)*RUN*D'/ 
/P(3)*RUN*D'/              $end of a control memory cycle 


In the above, register D is used to control the advance or stop of the four steps in 
a control memory cycle. When register D contains a 0, the sequence of the labels 
exists; otherwise, it disappears. 

During each control memory cycle, a micro-instruction is read out of the control 
memory. It is now specified that the transfer of the control memory address to register 
CAR and the initiation of the control memory read must occur during clock phase 
P(3) of the preceding control memory cycle, and the control word becomes available 
at buffer register F during the first clock phase P(0) of the current control memory 
cycle. Micro-operations activated by the micro-instruction in register F are executed 
during clock phases P(1-3) of the current control memory cycle. 

Register D is automatically set to zero at the end of each main memory cycle 
(/MC(3)*P(3)*RUN/). Thus, when it is required to wait for the beginning of the 
main memory cycle, register D is set to 1 to stop generation of the control signals 
during a control memory cycle, but the control signals for the main memory cycle 
continue. If a micro-instruction is fetched at the beginning of a main memory cycle 
and register D is set to 1 at the same time, then the micro-instruction remains in 
register F for one main memory cycle, as will be described later. 




11.4.3 Control Word Format 


Table 11.1 shows the control word format. The 36 bits of each control word in 
register F are divided into three groups: 


1. Field F(1-8) which contains a control memory address, 


2. Field F(9-23) which is divided into five subfields with a decoder attached to each sub- 
field, 


3. Field F(24-36) where each bit controls one micro-operation or a group of micro- 
operations. 


There are 28 control bits in field F(9-36) which control 43 execution statements. 

Field F(1-8) provides a two-way branch to each micro-instruction. The five 
subfields in field F(9-23) are: F(9-11), F(12-13), F(14-16), F(17-20), and F(21-23). 
Each subfield controls micro-operations whose occurrences are mutually exclusive. 
Field F(9-11) controls micro-operations which initialize the three counters. Field 
F(12-13) controls counting micro-operations. Field F(14-16) controls micro-operations 
which load address register AR. Field F(17-20) controls micro-operations 
which involve storage register SR and register B. Field F(21-23) controls 
micro-operations which set up the control memory address in register CAR. 

The 13 control bits in field F(24-36) control the remaining micro-operations. 
Note that bit F(24) controls the shift and test micro-operations involving register B 
and casregister AB in two clock phases P(1) and P(3). 
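
The field boundaries above can be made concrete with a short Python sketch that 
splits a 36-bit control word into its address field, its five decoded subfields, and 
the list of control bits that are set. The function name and the dictionary layout 
are assumptions made for this illustration; the decoder names DC, DX, DAR, DBS, 
and DT are those of Table 11.1. 

```python
# (name, first bit, last bit), with bits numbered 1..36 as in register F
FIELDS = [
    ("ADS", 1, 8),     # control memory address
    ("DC", 9, 11),     # counter initialization
    ("DX", 12, 13),    # counting
    ("DAR", 14, 16),   # loading of address register AR
    ("DBS", 17, 20),   # storage register SR and register B
    ("DT", 21, 23),    # setting up CAR
]


def split_control_word(word):
    """Split a control word given as a string of 36 binary digits.

    Blanks are ignored so that a word can be written field by field.
    Returns the numeric value of each subfield plus the list of the
    individual control bits F(24)..F(36) that are 1.
    """
    bits = word.replace(" ", "")
    if len(bits) != 36 or set(bits) - {"0", "1"}:
        raise ValueError("expected 36 binary digits")
    fields = {name: int(bits[lo - 1:hi], 2) for name, lo, hi in FIELDS}
    fields["bits"] = [n for n in range(24, 37) if bits[n - 1] == "1"]
    return fields
```

Applied to the word 00111111 100 10 001 0010 111 0000000101000, the sketch yields 
ADS=63, DC=4, DX=2, DAR=1, DBS=2, DT=7, and control bits F(31) and F(33). 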


11.5 Microprogramming 


This section shows the microprogramming of the loader. The microprogram is 
first shown in the CDL statements from which the microprogram in 1's and 0's is 
obtained. The microprogram in the CDL statements is obtained from the sequence 
charts in Figs. 11.9 to 11.12 and the control word format in Table 11.1. As will be 
shown, the microprogram consists of 24 micro-instructions: three for the unpacking 
subsequence, three for the initialization sequence, two for the fetch sequence, and 16 
for the address modification sequence. 


TABLE 11.1 Control Word Format 

CONTROL                 CONTROL 
BITS        DECODER     SIGNAL        MICRO-OPERATIONS 

F(1-8)                                Control memory address field 
F(9-11)     DC(0-7) 
            DC(1)       D'*P(1)       C1←FS, 
            DC(2)       D'*P(1)       C1←3, 
            DC(3)       D'*P(1)       C1←5, 
            DC(4)       D'*P(1)       C2←12, 
            DC(5)       D'*P(1)       C3←0, 
            DC(6)       D'*P(1)       C3←7, 
F(12-13)    DX(0-3) 
            DX(1)       D'*P(2)       X1←countup X1, 
            DX(2)       D'*P(2)       X2←countup X2, 
            DX(3)       D'*P(2)       X3←countup X3, 
F(14-16)    DAR(0-7) 
            DAR(1)      MC(0)*P(1)    AR←X2, 
            DAR(2)      MC(0)*P(1)    AR←X3, 
            DAR(3)      MC(0)*P(1)    AR←X8 add X4, 
            DAR(4)      MC(0)*P(1)    AR←X8 add X5, 
            DAR(5)      MC(0)*P(1)    AR←X8 add X6, 
F(17-20)    DBS(0-15) 
            DBS(1)      D'*P(2)       SR←A, 
            DBS(2)      MC(2)*P(1)    B(IN)←SR, 
            DBS(3)      MC(2)*P(1)    X8←SR(AD), 
            DBS(4)      D'*P(1)       X1←B(ADR), 
            DBS(5)      D'*P(1)       B(ADR)←X7, 
            DBS(6)      D'*P(1)       X8←B(ADR), 
            DBS(7)      D'*P(1)       B(ADR)←X8 add X1, 
            DBS(8)      D'*P(1)       B(ADR)←X8 add X7, 
F(21-23)    DT(0-7) 
            DT(1)       D'*P(3)       CAR←countup CAR, 
            DT(2)       D'*P(3)       CAR←F(ADS), 
            DT(3)       D'*P(3)       CAR←F(1-5)-OP, 
            DT(4)       D'*P(3)       CAR←RETURN, 
            DT(5)       D'*P(3)       IF (II=0) THEN (CAR←F(ADS)) ELSE 
                                      (CAR←countup CAR), 
            DT(6)       D'*P(3)       IF (II=1) THEN (CAR←F(ADS)) ELSE 
                                      (CAR←countup CAR), 
            DT(7)       MC(2)*P(3)    CAR←countup CAR, 
                        MC(3)*P(3)    IF (C1≠0) THEN (CAR←F(ADS)), 
                                      IF ((C1=0)*(C3≠12)) THEN (CAR←RETURN), 
F(24)                   D'*P(1)       IF (SH=0) THEN (B←3 shl B) ELSE (AB←3 shl AB), 
                                      IF (SH=1) THEN (C3←countup C3), 
                                      C1←countdn C1, 
                                      C2←countdn C2, 
                        D'*P(3)       IF (C2=0) THEN (CAR←countup CAR, 
                                        IF (MC(0)+MC(1)+MC(2)=1) THEN (D←1)), 
                                      IF ((C1=0)*(C2≠0)*(C3=12)) THEN (CAR←F(ADS), 
                                        IF (MC(0)+MC(1)+MC(2)=1) THEN (D←1)), 
                                      IF ((C1=0)*(C2≠0)*(C3≠12)) THEN (CAR←RETURN), 
F(25)                   D'*P(1)       OP←B(1-3), FS←B(4-7), II←B(8-9), 
F(26)                   D'*P(2)       RETURN←CAR, 
                        D'*P(3)       RETURN←countup RETURN, 
F(27)                   D'*P(1)       A←0, 
F(28)                   D'*P(1)       X1←0, 
F(29)                   D'*P(1)       X1←X1', 
F(30)                                 (not used) 
F(31)                   MC(0)*P(1)    READ←1, 
F(32)                   D'*P(2)       SH←0, 
F(33)                   D'*P(3)       IF (MC(0)+MC(1)+MC(2)=1) THEN (D←1), 
F(34)                   MC(0)*P(1)    WRITE←1, 
F(35)                   D'*P(2)       SH←1, 
F(36)                   D'*P(3)       RUN←0, 


11.5.1 Unpacking Sequence 


The three micro-instructions for this sequence are described below. 


Comment, unpacking sequence                                                  (11.5) 


Comment, shift-and-test micro-instruction located at CM address 63           (11.6) 
/D'*RUN*P(0)/              F←CM(CAR), 
/D'*RUN*P(1)*F(24)/        IF (SH=0) THEN (B←3 shl B) ELSE (AB←3 shl AB), 
                           IF (SH=1) THEN (C3←countup C3), 
                           C1←countdn C1, C2←countdn C2, 
/D'*RUN*P(3)*F(24)/        IF (C2=0) THEN (CAR←countup CAR, 
                             IF (MC(0)+MC(1)+MC(2)=1) THEN (D←1)), 
                           IF ((C1=0)*(C2≠0)*(C3=12)) THEN (CAR←F(ADS), 
                             IF (MC(0)+MC(1)+MC(2)=1) THEN (D←1)),    $F(ADS)=65 
                           IF ((C1=0)*(C2≠0)*(C3≠12)) THEN (CAR←RETURN) 


Comment, load a MM-word micro-instruction located at CM address 64           (11.7) 
/D'*RUN*P(0)/              F←CM(CAR), 
/RUN*MC(0)*P(1)*DAR(1)/    AR←X2, 
/RUN*MC(0)*P(1)*F(31)/     READ←1, 
/D'*RUN*P(1)*DC(4)/        C2←12, 
/D'*RUN*P(2)*DX(2)/        X2←countup X2, 
/D'*RUN*P(3)*F(33)/        IF (MC(0)+MC(1)+MC(2)=1) THEN (D←1), 
Comment, this micro-instruction remains in F until the end of the MM cycle 
/RUN*MC(1)*P(1)/           SR←M(AR), 
/RUN*MC(2)*P(1)*DBS(2)/    B(IN)←SR, 
/RUN*MC(2)*P(3)*DT(7)/     CAR←countup CAR, 
/RUN*MC(3)*P(3)*DT(7)/     IF (C1≠0) THEN (CAR←F(ADS)),               $F(ADS)=63 
                           IF ((C1=0)*(C3≠12)) THEN (CAR←RETURN), 
/RUN*MC(3)*P(3)/           D←0, 


Comment, store a MM-word micro-instruction located at CM address 65          (11.8) 
/D'*RUN*P(0)/              F←CM(CAR), 
/RUN*MC(0)*P(1)*DAR(2)/    AR←X3, 
/RUN*MC(0)*P(1)*F(34)/     WRITE←1, 
/D'*RUN*P(1)*DC(5)/        C3←0, 
/D'*RUN*P(2)*DX(3)/        X3←countup X3, 
/D'*RUN*P(2)*DBS(1)/       SR←A, 
/D'*RUN*P(3)*DT(4)/        CAR←RETURN, 
/RUN*MC(3)*P(1)/           M(AR)←SR, 


In the above description, register UNPACK in Fig. 11.11 is replaced by register 
RETURN. When the unpacking sequence is called, the next control memory address 
is stored in register RETURN, and the transfer from the calling sequence to the 
unpacking sequence is carried out by micro-operation CAR←F(ADS). Subregister 
F(ADS) contains the address of the unpacking sequence in the control memory. 
When the unpacking sequence is terminated, the return to the calling sequence is 
performed by micro-operation CAR←RETURN. 

As is shown above, the first micro-instruction is located at control memory 
address 63. As shown in the sequence chart of Fig. 11.11, in addition to shifting 
register B or casregister AB, this micro-instruction performs a four-way branch as 
follows: 


1. If condition (C1≠0)*(C2≠0) is true, then repeat the shift-and-test micro-instruction; 

2. If condition (C2=0) is true, then a word in the input buffer is read out of the main 
memory and stored in subregister B(IN); 

3. If condition (C1=0)*(C2≠0)*(C3=12) is true, then the contents in register A are 
written into the output buffer in the main memory; 

4. If condition (C1=0)*(C2≠0)*(C3≠12) is true, then the unpacking subsequence is 
terminated, and the control is returned to the calling sequence. 


If the branch 2 or 3 is performed, the wait register D is set to 1 because the micro- 
instructions to be executed require one main memory cycle. 
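
The four-way branch can be restated as a small Python sketch. It is an illustration 
under assumptions: the function name and the returned pair are invented for the 
example, and the second element only indicates that the branch target needs a main 
memory cycle (in the hardware, register D is set only when ring counter MC is not 
already in its last state). 

```python
def shift_and_test_branch(c1, c2, c3, car, f_ads, ret):
    """Choose the next control memory address after the shift-and-test
    micro-instruction, given the counters after counting.

    car:   address of the shift-and-test micro-instruction itself
    f_ads: contents of F(ADS), the address of the store micro-instruction
    ret:   contents of register RETURN
    Returns (next CAR, True if a main memory cycle is required).
    """
    if c2 == 0:
        return car + 1, True      # branch 2: read a word from the input buffer
    if c1 != 0:
        return car, False         # branch 1: repeat shift and test
    if c3 == 12:
        return f_ads, True        # branch 3: store register A in the output buffer
    return ret, False             # branch 4: return to the calling sequence
```

The order of the tests reflects the four conditions in the list: branch 2 depends 
on C2 alone, and the other three branches all require C2 to be nonzero. 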

The second micro-instruction is located at control memory address 64. It is 
fetched and executed if condition (C2=0) of the shift-and-test micro-instruction is 
true. This micro-instruction reads a word from the input buffer in the main memory 
and stores it into subregister B(IN). The micro-instruction requires one main memory 
cycle. To accomplish this, wait register D is set to 1, causing the micro-instruction 
to remain in register F for a full main memory cycle. 

The third micro-instruction (11.8) is located at control memory address 65. During 
its control memory cycle, the write to the output buffer is initiated, and it is 
completed three control memory cycles later. Again, the micro-instruction returns 
control to the calling sequence. 


11.5.2 Initialization Sequence 


The initialization sequence initializes the loader. The three micro-instructions 
that make up the sequence are shown below. 


Comment, initialization sequence                                             (11.9) 

Comment, initiate input-buffer read micro-instruction at CM address 66       (11.10) 
/D'*RUN*P(0)/              F←CM(CAR), 
/RUN*MC(0)*P(1)*DAR(1)/    AR←X2, 
/RUN*MC(0)*P(1)*F(31)/     READ←1, 
/D'*RUN*P(1)*DBS(5)/       B(ADR)←X7, 
/D'*RUN*P(1)*DC(6)/        C3←7, 
/D'*RUN*P(1)*F(27)/        A←0, 
/D'*RUN*P(3)*DT(1)/        CAR←countup CAR, 

Comment, main-memory-read micro-instruction located at CM address 67         (11.11) 
/D'*RUN*P(0)/              F←CM(CAR), 
/RUN*MC(1)*P(0)/           SR←M(AR), 
/D'*RUN*P(1)*DC(4)/        C2←12, 
/D'*RUN*P(2)*DX(2)/        X2←countup X2, 
/D'*RUN*P(3)*DT(1)/        CAR←countup CAR, 

Comment, load register B micro-instruction located at CM address 68 
/D'*RUN*P(0)/              F←CM(CAR), 
/RUN*MC(2)*P(1)*DBS(2)/    B(IN)←SR, 
/D'*RUN*P(1)*DC(3)/        C1←5, 
/D'*RUN*P(3)*DT(1)/        CAR←countup CAR, 




The three micro-instructions located at control memory addresses 66, 67, and 
68 execute sequentially during the first three control memory cycles of a main memory 
cycle. They initialize the contents of registers A and B and set the counters for the 
fetch sequence. The third micro-instruction increments control memory address 
register CAR to begin the fetch sequence. 


11.5.3 Fetch Sequence 


The fetch sequence fetches the next relocatable word. The two micro-instructions 
of the fetch sequence are: 


Comment, fetch sequence                                                      (11.12) 

Comment, initiate-unpacking-sequence micro-instruction at CM address 69      (11.13) 
/D'*RUN*P(0)/              F←CM(CAR), 
/D'*RUN*P(2)*F(35)/        SH←1, 
/D'*RUN*P(2)*F(26)/        RETURN←CAR, 
/D'*RUN*P(3)*F(26)/        RETURN←countup RETURN, 
/D'*RUN*P(3)*DT(2)/        CAR←F(ADS),                                $F(ADS)=63 

Comment, decode and initiate-unpacking-seq. micro-instruction at CM address 70 (11.14) 
/D'*RUN*P(0)/              F←CM(CAR), 
/D'*RUN*P(1)*F(25)/        OP←B(1-3), FS←B(4-7), II←B(8-9), 
/D'*RUN*P(1)*DC(2)/        C1←3, 
/D'*RUN*P(2)*F(32)/        SH←0, 
/D'*RUN*P(2)*F(26)/        RETURN←CAR, 
/D'*RUN*P(3)*F(26)/        RETURN←countup RETURN, 
/D'*RUN*P(3)*DT(2)/        CAR←F(ADS),                                $F(ADS)=63 

As shown above, the first micro-instruction is located at control memory address 
69. This micro-instruction performs the transfer from the fetch sequence to the 
unpacking sequence with the indication to perform a shift of casregister AB. This is 
accomplished by loading register RETURN with the address of the next 
micro-instruction, setting register SH to 1, and loading the control address register 
with the address of the unpacking sequence. 

The second micro-instruction is executed on returning from the unpacking 
sequence. This micro-instruction loads the registers OP, FS, and II. It then calls the 
unpacking sequence with register SH set to zero in order to left-adjust the address 
part or data part of the relocatable word in register B. It loads the register RETURN 
with the address of the next micro-instruction, which is the first micro-instruction 
of the address modification sequence. 
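
The call and return mechanism built on register RETURN can be sketched in Python 
as follows. The class and method names are assumptions made for the illustration; 
the three assignments in call follow the micro-operations controlled by F(26) and 
DT(2), and the 8-bit wraparound reflects the declared width of RETURN(1-8). 

```python
class MicroSequencer:
    """Single-level micro-subroutine linkage with registers CAR and RETURN."""

    def __init__(self, car=0):
        self.car = car      # control memory address register
        self.ret = 0        # register RETURN

    def call(self, f_ads):
        """Transfer to the micro-subroutine at address f_ads."""
        self.ret = self.car                 # P(2): RETURN <- CAR
        self.ret = (self.ret + 1) & 0xFF    # P(3): RETURN <- countup RETURN
        self.car = f_ads                    # P(3): CAR <- F(ADS)

    def do_return(self):
        """Resume the calling sequence."""
        self.car = self.ret                 # CAR <- RETURN
```

Starting with CAR=69, call(63) leaves 70 in RETURN, so do_return resumes at the 
second fetch micro-instruction. Since there is only one RETURN register, a second 
call made before the return would overwrite the saved address. 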


11.5.4 Address Modification Sequence 


The address modification sequence is described in the form of a sequence chart 
in Fig. 11.12. The CDL description of the sequence appears below. 


Comment, address modification sequence                                       (11.15) 

Comment, branch on OP micro-instruction located at CM address 71             (11.16) 
/D'*RUN*P(0)/              F←CM(CAR), 
/D'*RUN*P(1)*DBS(6)/       X8←B(ADR), 
/D'*RUN*P(3)*DT(3)/        CAR←F(1-5)-OP,                             $F(ADS)=72 
/D'*RUN*P(3)*F(33)/        IF (MC(0)+MC(1)+MC(2)=1) THEN (D←1), 

Comment, error-stop micro-instruction located at CM address 72               (11.17) 
/D'*RUN*P(0)/              F←CM(CAR), 
/D'*RUN*P(3)*F(36)/        RUN←0, 

Comment, data micro-instruction located at CM address 73                     (11.18) 
/D'*RUN*P(0)/              F←CM(CAR), 
/D'*RUN*P(1)*DC(1)/        C1←FS, 
/D'*RUN*P(3)*DT(2)/        CAR←F(ADS),                                $F(ADS)=69 

Comment, relative-address micro-instruction located at CM address 74         (11.19) 
/D'*RUN*P(0)/              F←CM(CAR), 
/D'*RUN*P(1)*DBS(8)/       B(ADR)←X8 add X7, 
/D'*RUN*P(1)*DC(1)/        C1←FS, 
/D'*RUN*P(3)*DT(2)/        CAR←F(ADS),                                $F(ADS)=69 

Comment, common-area-address micro-instruction located at CM address 75      (11.20) 
/D'*RUN*P(0)/              F←CM(CAR), 
/RUN*MC(0)*P(1)*DAR(3)/    AR←X8 add X4, 
/RUN*MC(0)*P(1)*F(31)/     READ←1, 
/D'*RUN*P(3)*DT(2)/        CAR←F(ADS),                                $F(ADS)=81 

Comment, external-address micro-instruction located at CM address 76         (11.21) 
/D'*RUN*P(0)/              F←CM(CAR), 
/RUN*MC(0)*P(1)*DAR(4)/    AR←X8 add X5, 
/RUN*MC(0)*P(1)*F(31)/     READ←1, 
/D'*RUN*P(3)*DT(2)/        CAR←F(ADS),                                $F(ADS)=80 
/D'*RUN*P(3)*F(33)/        IF (MC(0)+MC(1)+MC(2)=1) THEN (D←1), 
/RUN*MC(1)*P(1)/           SR←M(AR), 
/RUN*MC(2)*P(1)*DBS(3)/    X8←SR(AD), 
/RUN*MC(3)*P(3)/           D←0, 

Comment, stop micro-instruction located at CM address 77                     (11.22) 
/D'*RUN*P(0)/              F←CM(CAR), 
/D'*RUN*P(3)*F(36)/        RUN←0, 

Comment, error-stop micro-instruction located at CM address 78               (11.23) 
/D'*RUN*P(0)/              F←CM(CAR), 
/D'*RUN*P(3)*F(36)/        RUN←0, 

Comment, error-stop micro-instruction located at CM address 79               (11.24) 
/D'*RUN*P(0)/              F←CM(CAR), 
/D'*RUN*P(3)*F(36)/        RUN←0, 

Comment, main-memory-table read micro-instruction located at CM address 80   (11.25) 
/D'*RUN*P(0)/              F←CM(CAR), 
/RUN*MC(0)*P(1)*DAR(5)/    AR←X8 add X6, 
/RUN*MC(0)*P(1)*F(31)/     READ←1, 
/D'*RUN*P(3)*DT(1)/        CAR←countup CAR, 

Comment, branch-if-no-increment micro-instruction located at CM address 81   (11.26) 
/D'*RUN*P(0)/              F←CM(CAR), 
/RUN*MC(1)*P(1)/           SR←M(AR), 
/D'*RUN*P(3)*DT(5)/        IF (II=0) THEN (CAR←F(ADS)) 
                           ELSE (CAR←countup CAR),                    $F(ADS)=86 

Comment, initiate-unpacking-sequence micro-instruction located at CM address 82 (11.27) 
/D'*RUN*P(0)/              F←CM(CAR), 
/RUN*MC(2)*P(1)*DBS(3)/    X8←SR(AD), 
/D'*RUN*P(1)*DC(1)/        C1←FS, 
/D'*RUN*P(2)*F(32)/        SH←0, 
/D'*RUN*P(2)*F(26)/        RETURN←CAR, 
/D'*RUN*P(3)*F(26)/        RETURN←countup RETURN, 
/D'*RUN*P(3)*DT(2)/        CAR←F(ADS),                                $F(ADS)=63 

Comment, increment-decrement branch micro-instruction located at CM address 83 (11.28) 
/D'*RUN*P(0)/              F←CM(CAR), 
/D'*RUN*P(1)*DBS(4)/       X1←B(ADR), 
/D'*RUN*P(3)*DT(6)/        IF (II=1) THEN (CAR←F(ADS)) 
                           ELSE (CAR←countup CAR),                    $F(ADS)=85 

Comment, 2's-complement micro-instruction located at CM address 84           (11.29) 
/D'*RUN*P(0)/              F←CM(CAR), 
/D'*RUN*P(1)*F(29)/        X1←X1', 
/D'*RUN*P(2)*DX(1)/        X1←countup X1, 
/D'*RUN*P(3)*DT(1)/        CAR←countup CAR, 

Comment, modify absolute address micro-instruction located at CM address 85  (11.30) 
/D'*RUN*P(0)/              F←CM(CAR), 
/D'*RUN*P(1)*DBS(7)/       B(ADR)←X8 add X1, 
/D'*RUN*P(1)*DC(1)/        C1←FS, 
/D'*RUN*P(3)*DT(2)/        CAR←F(ADS),                                $F(ADS)=69 

Comment, no-increment micro-instruction located at CM address 86             (11.31) 
/D'*RUN*P(0)/              F←CM(CAR), 
/RUN*MC(2)*P(1)*DBS(3)/    X8←SR(AD), 
/D'*RUN*P(1)*F(28)/        X1←0, 
/D'*RUN*P(3)*DT(2)/        CAR←F(ADS),                                $F(ADS)=85 


Micro-instruction (11.16) is located at control memory address 71. This 
micro-instruction performs an eight-way branch on the contents of register OP. This is 
accomplished by concatenating the first five bits of subregister F(ADS) with register 
OP to form an eight-bit control memory address. Of the eight possible addresses, 
five are legitimate and will be discussed in detail subsequently. The other three, 
corresponding to OP values 0, 6, and 7, cause an error-stop. The subregister F(ADS) 
contains the control memory address 72. This means that the eight addresses possible 
are 72 through 79, with addresses 72, 78, and 79 corresponding to the illegitimate 
addresses. It should be noted that register X8 is loaded by this micro-instruction. 
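
The formation of the branch address can be checked with a few lines of Python. The 
function name is an assumption made for the illustration; the masking keeps the 
five high-order bits of the eight-bit field F(ADS) and places the three bits of 
register OP in the low-order positions. 

```python
def branch_on_op(f_ads, op):
    """Form the control memory address CAR <- F(1-5)-OP.

    f_ads: the 8-bit contents of F(ADS); only its first five bits are used
    op:    the 3-bit contents of register OP (0..7)
    """
    if not 0 <= op <= 7:
        raise ValueError("OP is a 3-bit field")
    return (f_ads & 0b11111000) | op
```

With F(ADS) equal to 72 (01001000), the eight values of OP select addresses 72 
through 79, which is why the branch table must start at a multiple of eight. 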

If OP is equal to 1, the micro-instruction (11.18) at control memory address 73 
is executed. As this means that the relocatable word contains data, the 
micro-instruction simply loads counter C1 and branches to the fetch sequence. 

If OP is equal to 2, it indicates relative address; the micro-instruction (11.19) 
at control memory address 74 is executed. This micro-instruction adds the address 
in X8 to the subprogram address in X7 and branches to the fetch sequence. 

If OP is equal to 3, it indicates a common area address; the micro-instruction 
(11.20) at control memory address 75 is executed. This micro-instruction initiates 
the read of the Common Symbol Table in the main memory and branches to the 
micro-instruction at location 81 to test for the existence of an increment field in the 
relocatable word. It should be noted that micro-instruction (11.16) sets wait register 
D in order to assure that this instruction is performed at the beginning of a main 
memory cycle. 

If OP is equal to 4, it indicates an external address; the micro-instruction (11.21) 
at control memory address 76 is executed. This micro-instruction performs a read of 
the undefined symbol table in the main memory and then branches to the micro- 
instruction at control memory address 80 which initiates the read of the defined 
symbol table in the main memory. 

If OP is equal to 5, it indicates the termination of the translation process; the 
micro-instruction (11.22) at control memory address 77 is executed. This micro- 
instruction sets register RUN to zero to stop generation of all signals. 

The remaining micro-instructions in control memory address locations 81 through 
86 are executed if OP is equal to 3 or 4. These micro-instructions perform the test for 
the increment or decrement and the address modification operations described in the 
sequence chart in Fig. 11.12. This entails initiating the unpacking sequence in the 
micro-instruction (11.27) at control memory address 82. It should also be noted that 
the main memory read initiated by the micro-instruction located at the control 
memory address 75 (OP = 3) or 80 (OP = 4) is continued in parallel with the 
execution of these micro-instructions. 


11.5.5 Microprogram 


The microprogram is shown in Table 11.2 where the micro-instructions are 
stored at the arbitrarily chosen locations 63 to 86. Each micro-instruction of the 
microprogram is obtained from the previous CDL descriptions of the 
micro-instructions. 

TABLE 11.2 The Microprogram of the Loader 

CONTROL 
MEMORY                      DC      DX      DAR     DBS      DT 
ADDRESS    F(ADS)=F(1-8)    (0-7)   (0-3)   (0-7)   (0-15)   (0-7)   F(24-36) 

63         01000001         000     00      000     0000     000     1000000000000 
64         00111111         100     10      001     0010     111     0000000101000 
65         00000000         101     11      010     0001     100     0000000000100 
66         00000000         110     00      001     0101     001     0001000100000 
67         00000000         100     10      000     0000     001     0000000000000 
68         00000000         011     00      000     0010     001     0000000000000 
69         00111111         000     00      000     0000     010     0010000000010 
70         00111111         010     00      000     0000     010     0110000010000 
71         01001000         000     00      000     0110     011     0000000001000 
72         00000000         000     00      000     0000     000     0000000000001 
73         01000101         001     00      000     0000     010     0000000000000 
74         01000101         001     00      000     1000     010     0000000000000 
75         01010001         000     00      011     0000     010     0000000100000 
76         01010000         000     00      100     0011     010     0000000101000 
77         00000000         000     00      000     0000     000     0000000000001 
78         00000000         000     00      000     0000     000     0000000000001 
79         00000000         000     00      000     0000     000     0000000000001 
80         00000000         000     00      101     0000     001     0000000100000 
81         01010110         000     00      000     0000     101     0000000000000 
82         00111111         001     00      000     0011     010     0010000010000 
83         01010101         000     00      000     0100     110     0000000000000 
84         00000000         000     01      000     0000     001     0000010000000 
85         01000101         001     00      000     0111     010     0000000000000 
86         01010101         000     00      000     0011     010     0000100000000 


For example, the micro-instruction (11.7) for loading a main memory word 
described previously in the unpacking sequence is assigned to location 64. The field 
F(ADS) should contain 63 because of the branch micro-operation CAR←F(ADS) 
in the micro-instruction. In each of the execution statements, there is one 
micro-operation and a corresponding decoder terminal or control bit in the label 
according to the control word format in Table 11.1. Consider the second execution 
statement of this example. This statement contains micro-operation AR←X2. According 
to the control word format, this micro-operation is controlled by decoder terminal 
DAR(1) in field F(14-16); therefore, this field, as shown at location 64 of Table 11.2, 
contains 001. Similarly, each non-zero subfield controls the execution of one 
micro-operation or a group of micro-operations by the terminal. Consider also the 
micro-operation for initiating a memory read, READ←1. According to the control word 
format, this micro-operation is controlled by control bit F(31). The control word at 
location 64 in Table 11.2 shows that this bit is 1. Similarly, each control bit in field 
F(24-36) controls the execution of one micro-operation or a group of micro-operations. 
In this manner, the microprogram in Table 11.2 is prepared. 
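
The preparation of one control word can be mimicked in Python. The sketch below 
is only an illustration of the encoding step: the function name and its keyword 
parameters are assumptions, while the field widths and the decoder terminal numbers 
are taken from Table 11.1. 

```python
def encode_control_word(ads, dc=0, dx=0, dar=0, dbs=0, dt=0, bits=()):
    """Assemble a 36-bit control word, printed field by field.

    ads:  control memory address for F(1-8)
    dc, dx, dar, dbs, dt: selected decoder terminal of each subfield (0 = none)
    bits: numbers of the individual control bits F(24)..F(36) to be set
    """
    limits = {"ads": (ads, 8), "dc": (dc, 3), "dx": (dx, 2),
              "dar": (dar, 3), "dbs": (dbs, 4), "dt": (dt, 3)}
    for name, (value, width) in limits.items():
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    if any(not 24 <= n <= 36 for n in bits):
        raise ValueError("single control bits are F(24) through F(36)")
    tail = "".join("1" if n in bits else "0" for n in range(24, 37))
    return " ".join([f"{ads:08b}", f"{dc:03b}", f"{dx:02b}",
                     f"{dar:03b}", f"{dbs:04b}", f"{dt:03b}", tail])
```

For micro-instruction (11.7), the labels name terminals DC(4), DX(2), DAR(1), 
DBS(2), and DT(7) and control bits F(31) and F(33), and F(ADS) is 63; encoding these 
values reproduces the row at location 64 of Table 11.2. 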

At location 71 of this microprogram, the micro-instruction branches to one of 
locations 72 through 79. Locations 72 through 79 contain eight micro-instructions 
(only five are used), each of which performs the micro-operations required by the OP 
code of the relocatable word. This table is located by the address in field F(ADS) 
(see 01001000 at location 71 in Table 11.2). 

It is possible to estimate the speed of translating the executable code (i.e., instruc- 
tions or data words). For one extreme, it requires seven main memory cycles to 
translate a data word. For the other extreme, it requires 24 main memory cycles to 
translate a two-address instruction where the addresses are external addresses, because 
it requires four relocatable words for such an instruction. Let the main memory 
cycle time be one microsecond. Then, the microprogrammed loader is capable of 
producing from 41,700 to 143,000 instructions or data words per second. 
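
The two rates follow from dividing one second by the time needed per word. A 
minimal Python check of this arithmetic, with a function name assumed for the 
illustration, is: 

```python
def words_per_second(cycles_per_word, cycle_time_s=1e-6):
    """Translation rate for a given number of main memory cycles per word.

    cycle_time_s defaults to the one-microsecond cycle assumed in the text.
    """
    return 1.0 / (cycles_per_word * cycle_time_s)
```

Seven cycles per word give about 142,900 words per second, and 24 cycles give 
about 41,700, which agree with the rounded figures quoted above. 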


References 


1. FLORES, I., Computer Software. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1965. 

2. HASSETT, A., Computer Programming and Computer Systems. New York: Academic Press, 
Inc., 1967. 

3. MAURER, W. D., Programming: An Introduction to Computer Languages and Techniques. 
San Francisco: Holden-Day, Inc., 1968. 

4. BARRON, D. W., Assemblers and Loaders. New York: American Elsevier Inc., 1969. 

5. ROSIN, R. F., “Supervisory and Monitor Systems,” Computing Surveys of the ACM, 
March, 1969, pp. 37-54. 

6. CHU, Y., PARDO, O. R., and YEH, J., “A Methodology for Unified Hardware-software 
Design,” Technical Report 70-107, Computer Science Center, University of Maryland, 
January, 1970. 

7. CHU, Y., Introduction to Computer Organization. Englewood Cliffs, N.J.: Prentice-Hall, 
Inc., 1970. 


Problems 


11.1. By making use of the technique for formatting the control word described in Chapter 8, 
redesign the control word format in Table 11.1 for the purpose of reducing the size 
(i.e., width and length) of the microprogram. 

11.2. Given the control word format in Problem 11.1, prepare a microprogram in CDL for 
the relocatable code translation. 


11.3. Draw a sequence chart for the microprogram described in Problem 11.2. 


Index 


A 


Accumulator, 117 
Activity list, 300 
Addressable unit, 263 
Addressing: 
base, 267 
direct, 264 
dynamic, 358 
immediate, 264 
indirect, 264, 295, 357 
register, 268 
relative, 267 
Address translation, 312 
Arithmetic algorithm, 121, 240, 241, 245 
Arithmetic unit: 
binary, 115, 169, 208 
decimal, 230 
fixed-point, 115, 207 
floating-point, 169 
parallel, 115, 169 
serial, 207 
Array organization, 257 
Associative memory, 273, 287 
Asynchronous control, 368 


B 


Bias, 170 
Binary coded decimal numbers, 231 
addition, 232 
representation, 231 
subtraction, 232 
time sequence, 232 
Binary numbers: 
fixed-point, 115, 208 
floating-point, 115, 170 
signed, 35 
unsigned, 36 
Bit efficiency, 263 

Booth algorithm, 218 

Bowling-score computer, 71 

Buffer reference miss, 309 

Burst mode, 421, 451 


C 


Carry:
    group, 139, 142, 147
    input group, 141, 149
    input section, 142, 150
    local, 140
    section, 139, 143
Carry propagate, 139
    group, 142, 149
    section, 143, 150
Carry propagation, 140
CDL description:
    nonprocedural, 121
    procedural, 122, 128, 131
CDL statement:
    array-casregister, 3
    array-register, 4
    block, 196
    casregister, 3
    clock, 6
    comment, 15
    declaration, 2
    decoder, 6
    encoder, 301, 302, 362
    end, 13
    execution, 12
    light, 4
    memory, 4
    register, 2
    subregister, 3
    switch, 4
    terminal, 5, 143
Central processing unit, 296, 342, 353, 376, 398
Channel, 382, 400, 418, 421
    adapter, 423
    command, 455
    multiplexor, 389, 400, 450, 485
    program, 456
    selector, 400, 476
Channel address word, 454
Channel command word, 454
Channel status word, 453, 457
Characteristic, 171
Characteristic alignment, 176, 195
Checker, 324
Code:
    ASCII, 406
    binary, 57
    condition, 335, 433, 459
    EBCDIC, 405
    external BCD, 404
    Gray, 57
    hexadecimal, 403
    Hollerith card, 408
    internal BCD, 404
    interrupt, 432
    instruction-length, 409
    operation, 424
    relocatable, 496
Code converter:
    binary-to-decimal, 61
    Gray-to-binary, 57
Configuration of:
    address translation, 314
    associative memory, 274
    binary-to-decimal converter, 63
    bowling-score computer, 75
    channel, 384, 450, 451, 477
    computer system, 388-91
    decimal arithmetic unit, 239
    decimal multiplier, 248-49, 251
    dynamic loader, 284
    finding the largest number, 44, 48, 105
    Gray-to-binary code converter, 59
    IBM System/360 model 40, 412-13
    interface, 380, 465
    I/O control, 445, 448
    loader, 507, 514
    memory buffer, 301
    microprogrammed computer, 91
    paging, 316
    parallel fixed-point arithmetic unit, 117, 133, 154
    parallel floating-point arithmetic unit, 173
    prime number generator, 53
    priority interrupt, 361
    segmentation, 318
    segmented paging, 320
    serial arithmetic unit, 210
    serial comparator, 37
    serial parity generator, 33, 85
    stored carry addition, 68
    stored program computer, 17
Connect micro-operation, 343
Control cycle, 19, 88, 354
Control hierarchy, 338
Control memory, 83
Control path, 22
Control word, 86, 335
Control word format, 85, 93, 101, 108, 353, 517
Crossbar switch, 259, 390


D 


Data channel, 447 

Data flow, 411, 477 

Data path, 22 

Decimal-digit adder, 234, 240 
Decimal divider, 251 

Decimal multiplier, 247 
Decoder, 6 

Direct control, 440 

Dividend alignment, 188, 201 
Divide stop, 221 

Double rank register, 151 
Dynamic memory allocation, 312 


E 


Emit field, 341 
Encoder, 301, 302, 362 
End-around carry, 121 
Exponent, 170 
Expression, 8 


F


Fetch algorithm, 314, 322
Floating-point register, 426
Floating-point underflow, 172
Formats, 18, 86, 116, 171, 208, 297, 354, 401, 431, 454, 458, 498
Full adder:
    multiple-bit, 223
    single-bit, 118, 139
Full adder-subtracter, 209


G 


Gating, 343 
General register, 426 
Gray code, 57 


I


Initial program loading, 440

Input-output control, 445 

Input-output instructions, 426, 460, 481, 
484, 487, 490 

Input-output interface, 385, 422, 462 

Input-output interrupt, 460 

Interrupt, 359, 434 

Interrupt supervisor, 437 

Interval timer, 440 


L 


Label, 12 
Level, 139, 151 
Loader, 284 
Logout, 419 



M 


Match operations:
    Boolean argument, 283
    count argument, 283
    numerical argument, 276
Memory:
    associative, 273
    buffer, 295
    cache, 296
    random access, 256, 261
    stack, 269
    virtual, 311
Micro-instruction, 86, 160, 335 
Micro-operation, 8, 23, 97 
Microprogram, 437 
Microprogram of:
    find the largest number, 111
    fixed-point arithmetic unit, 164
    loader, 526
    microprogrammed computer, 96
    parity generator, 86
    stored logic computer, 104
Microprogram control, 83, 335
Microprogram control configuration of:
    control unit, 336
    CPU, 342, 346
    finding the largest number, 105
    fixed-point arithmetic unit, 154
    loader, 513
    MICU, 351
    serial parity generator, 84
    stored logic computer, 98
    stored program computer, 90
    two level hierarchy, 339
Microprogram description, 87, 93, 102, 160, 513-24

Microprogrammed computer, 83, 90 
Microprogrammed control unit, 335 
Microprogrammed CPU, 342 
Microprogrammed I/O control unit, 349 
Microprogramming, 83, 146, 154
    horizontal, 341
    vertical, 341
Microprogramming software, 495
Micro-statement, 7
    conditional, 10
Modular organization, 389 
Module organization, 259 


Multiple access organization, 260 
Multiplexing, 389 

Multiplex mode, 421, 451 

Multiplexor channel, 389, 400, 450, 485 
Multiplier-quotient register, 117 


N 


Negative zero, 116 
Non-restoring algorithm, 221 
Normalization, 172, 180, 185, 197, 200 
Normal zero, 172 
Numerical match operation:
    between limits, 279
    larger than, 276
    maximum, 279
    minimum, 279
    smaller than, 276


O


Operating system, 400
Operator:
    arithmetical, 7
    basic, 7
    binary, 8
    functional, 7
    logical, 7
    special, 11, 24, 118, 205, 236, 238
    unary, 8
Overflow:
    addition, 121, 213
    division, 129, 221
    floating-point, 171


P 


Page table, 315 

Paging, 315 

Parallel adder, 5, 118, 139, 143, 174, 342 
Parallel subtracter, 42, 46 

Parity generator, 32, 84 

Placement algorithm, 315, 322 
Polling, 466 

Positive zero, 116 

Prime number generator, 50 
Privileged instructions, 430 
Program status word, 431, 454, 459 


R 


Radix, 170 
Replacement algorithm, 314, 322 


S


Scheduling, 322
Segmentation, 317
Segmented paging, 319
Selector channel, 400, 476
Self-complementing code, 232
Sequence:
    addition-subtraction, 122, 135, 160, 176, 215, 240
    division, 131, 138, 160, 186, 224, 245
    multiplication, 128, 136, 162, 181, 219, 241
Sequence chart of:
    binary-to-decimal converter, 66
    bowling-score computer, 77
    channel, 387
    CPU, 377
    decimal arithmetic unit, 242-44, 246-47
    dynamic loader, 290-91
    finding the largest number, 45, 49
    Gray-to-binary converter, 61
    IOU, 374
    loader, 505-10
    LSU, 386
    match operations, 278, 281
    memory buffer, 305-7
    MU, 371
    parallel fixed-point arithmetic unit, 122, 123, 125, 129, 130
    parallel floating-point arithmetic unit, 176-79, 181-92
    prime number generator, 54
    serial arithmetic unit, 215-18, 220, 225-27
    serial comparator, 39, 41
    serial parity generator, 34
    stored carry addition, 69
    stored program computer, 21
Sequence control, 466
Sequencing, 12, 294, 353, 429
Sequential logic control, 83, 330


Serial comparator, 35 
Serial subtracter, 47, 54 
Shift counter, 117 

Stack organization, 269 
Statement description of:
    binary-to-decimal converter, 62, 66
    bowling-score computer, 76, 79, 80
    channel, 383
    CPU, 376, 378, 379
    decimal arithmetic unit, 240
    dynamic loader, 285, 292, 294
    finding the largest number, 44, 46, 47, 50
    Gray-to-binary converter, 60
    interface, 381-82
    IOU, 373, 375
    loader, 505, 507, 513, 518-24
    LSU, 385
    match operations, 273-75
    memory buffer, 300-302, 308-9
    microprogrammed computer, 90, 92, 94, 95
    MU, 369, 372
    parallel adder, 143
    parallel fixed-point arithmetic unit, 117, 122, 123, 128, 131, 135-38, 154, 156, 160-64
    parallel floating-point arithmetic unit, 172-74, 193-202
    prime number generator, 53, 56
    priority interrupt, 361-63, 366-67
    serial arithmetic unit, 211-12, 228-30
    serial comparator, 36, 38, 40
    serial parity generator, 33, 35, 84, 87
    stored carry addition, 69
    stored logic computer, 98, 102
    stored program computer, 20-21, 24-26

Stats, 419 
Storage:
    local, 415
    main, 414
    multiplex, 415
    read-only, 416
Storage protection, 418, 438 
Storage through, 304 
Stored carry addition, 67 
Stored logic computer, 83, 98 
Stored program computer, 16, 368 
System units, 398 


T 


Transfer rate ratio, 310 
Translation algorithm, 502 


U


Unit control word, 454, 458, 476, 486


V


Virtual address, 312 
Virtual memory, 311 


